Abstract

This paper presents a logic language for expressing NP search and optimization problems.
Specifically, first a language obtained by extending (positive) DATALOG with intuitive and
efficient constructs (namely, stratified negation, constraints and exclusive disjunction) is
introduced. Next, a further restricted language only using a restricted form of disjunction
to define (non-deterministically) subsets (or partitions) of relations is investigated. This
language, called NP Datalog, captures the power of DATALOG¬ in expressing search and
optimization problems. A system prototype implementing NP Datalog is presented. The
system translates NP Datalog queries into OPL programs which are executed by the ILOG
OPL Development Studio. Our proposal combines easy formulation of problems, expressed
by means of a declarative logic language, with the efficiency of the ILOG System. Several
experiments show the effectiveness of this approach.

KEYWORDS: Logic languages, stable model semantics, constraint programming,
expressivity and complexity of declarative query languages.



1 Introduction 

It is well-known that NP search problems can be formulated by means of
DATALOG¬ (Datalog with unstratified negation) queries under non-deterministic
stable model semantics so that each stable model corresponds to a possible solution
(Marek and Truszczynski 1991; Saccà 1997). NP optimization problems can be
formulated by adding a max (or min) construct to select the stable model (thus, the
solution) which maximizes (resp., minimizes) the result of a polynomial function
applied to the answer relation. For instance, consider the Vertex Cover problem of
the following example.

Example 1 

Given an undirected graph G = (N, E), a subset V of the vertexes N is a vertex
cover of G if every edge of G has at least one end in V. The problem can be
formulated in terms of the query ⟨P1, v(X)⟩, where P1 is the following DATALOG¬
program:

v(X) ← node(X), ¬nv(X).
nv(X) ← node(X), ¬v(X).
c ← edge(X,Y), ¬v(X), ¬v(Y), ¬c.

and the predicates node and edge define, respectively, the vertexes and the edges
of the graph by means of a suitable number of facts. The first two rules define
a partition of the relation node (v being the vertex cover), whereas the last one
enforces every stable model to correspond to some vertex cover as it is satisfied only
if the conjunction edge(X, Y), ¬v(X), ¬v(Y) is false (otherwise the program does not
have stable models).

The min vertex cover problem can be expressed by selecting a stable model
which minimizes the number of elements in v; this is expressed by means of the
query ⟨P1, min|v(X)|⟩. □

The problem in using DATALOG¬ to express search and optimization problems is
that the use of unrestricted negation is often neither simple nor intuitive and besides 
it does not allow the expressive power and complexity of queries to be limited. For 
instance, in the example above, the use of explicit constraints instead of standard 
rules would permit the distinction between rules used to infer true atoms and rules 
used to check properties to be satisfied. 

In this paper, in order to enable a simpler and more intuitive formulation for
search and optimization problems and an efficient computation of queries, DATALOG-like
languages extending positive DATALOG with intuitive and efficient constructs
are considered. The first language we present, denoted by DATALOG¬s,⊕,⇐,
extends the simple and intuitive structure of DATALOG¬s (DATALOG with stratified
negation (Ullman 1988)) with two other types of 'controlled' negation: rules with
exclusive disjunctive heads and constraint rules. The same expressive power as
DATALOG¬ is achieved by such a language. Next, we propose a further restricted
language, called NP Datalog, where head disjunction is only used to define
(non-deterministically) partitions of relations. This language allows us to express, in a
simple and intuitive way, both NP search and optimization problems. As an example,
let us consider again the Vertex Cover problem.

Example 2 

The search query of the previous example can be expressed as ⟨P2, v(X)⟩ with P2
defined as follows:

v(X) ⊕ nv(X) ← node(X).
⇐ edge(X,Y), ¬v(X), ¬v(Y).

where ⊕ denotes exclusive disjunction, i.e., if the body of the rule is true, then
exactly one atom in the head is true. The rule with empty head defines a constraint, 
i.e., a rule which is satisfied only if the body is false. The first rule guesses a partition 
of node whereas the second one is a constraint stating that two connected nodes 
cannot be both outside the cover, which is defined by the nodes belonging to v. □ 
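
The guess-and-check reading of this program can be illustrated informally by a brute-force Python sketch. It is not part of the language or of the system described in this paper: the function names and the sample graph are assumptions made only for illustration. Every subset of node plays the role of a guessed relation v (nv being its complement), and the constraint discards the guesses that leave some edge uncovered.

```python
from itertools import combinations

def vertex_covers(nodes, edges):
    """Yield every guess v (subset of nodes) that satisfies the constraint."""
    ordered = sorted(nodes)
    for size in range(len(ordered) + 1):
        for guess in combinations(ordered, size):
            v = set(guess)  # guessed part v; nv is the complement
            # constraint: no edge may have both ends outside v
            if all(x in v or y in v for x, y in edges):
                yield v

def min_vertex_cover(nodes, edges):
    """Pick one cover of minimum cardinality (the goal min|v(X)|)."""
    return min(vertex_covers(nodes, edges), key=len)

# Hypothetical database: the path a - b - c - d.
nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("c", "d")}
covers = list(vertex_covers(nodes, edges))
best = min_vertex_cover(nodes, edges)
```

Each element of covers corresponds to one possible answer of the search query, while best corresponds to one non-deterministically selected answer of the optimization query. The enumeration is exponential in the number of nodes and is meant only to make the semantics concrete.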
Contribution.

The main contribution of this paper is the proposal of a simple and intuitive
language where the use of stable model semantics allows us to refrain from uncontrolled
forms of unstratified negation¹ and avoid both undefinedness and unnecessary
computational complexity.

More precisely, the paper presents the language NP Datalog, which extends
DATALOG¬s with constraints and head disjunction, where the latter is used only
to define (non-deterministically) partitions of "deterministic" relations. This
language allows both NP search and optimization problems to be expressed in a simple
and intuitive way.

The simplicity of the NP Datalog language enables queries to be easily translated
into other formalisms such as constraint programming languages, which are
well-suited to compute programs defining NP problems. This paper also shows
how NP Datalog queries can be translated into OPL (Optimization Programming
Language) (Van Hentenryck 1988; Van Hentenryck et al. 1999) programs.

Several examples of queries expressing NP problems suggest that logic formalisms
allow an easy formulation of queries. On the other hand, constraint programming
systems permit an efficient execution. Therefore, NP Datalog can also be used
to define a logic interface for constraint programming solvers. We have implemented
a system prototype which translates NP Datalog queries into OPL programs,
which are then executed by means of the ILOG OPL Development Studio
(ILOG OPL Studio). The effectiveness of our approach is demonstrated by several
experiments comparing NP Datalog with other systems.

With respect to other logic languages previously proposed (Cadoli et al. 2000;
Cadoli and Schaerf 2005; Eiter et al.; Greco et al. 1995; Simons et al. 2002),
the novelty of the paper is that it considers an answer set programming language
able to express the complete set of NP decision, search and optimization problems,
by using a restricted form of unstratified negation.

Organization. The paper is organized as follows. Section 2 introduces syntax and
semantics of DATALOG¬ and its ability to express NP search and optimization
queries under non-deterministic stable model semantics. Section 3 introduces the
DATALOG¬s,⊕,⇐ language, and shows its ability to express NP search and
optimization problems. Section 4 presents the NP Datalog language, obtained by
introducing simple restrictions to DATALOG¬s,⊕,⇐, and shows that NP Datalog has the
same expressive power as DATALOG¬s,⊕,⇐ and DATALOG¬. Section 5 illustrates how
NP Datalog queries can be translated into OPL programs and presents several
experiments showing the effectiveness of the proposed approach. Section 6 discusses
several related languages and systems recently proposed in the literature. Finally,
conclusions are drawn in Section 7.



1 The constructs here considered, essentially, force the use of a restricted form of unstratified 
negation. 
2 DATALOG¬

It is assumed that the reader is familiar with the basic terminology and notation of
relational databases and database queries (Abiteboul et al. 1995; Ullman 1988).

Syntax. A DATALOG¬ rule r is of the form A ← B_1, …, B_m, ¬B_{m+1}, …, ¬B_n, where
A is an atom (head of the rule) and B_1, …, B_m, ¬B_{m+1}, …, ¬B_n (with n ≥ 0) is a
conjunction of literals (body of the rule). A fact is a ground rule with empty body. 
Generally, predicate symbols are partitioned into two different classes: extensional 
(or EDB), i.e. defined by the ground facts of a database, and intensional (or IDB), 
i.e. defined by the rules of the program. The definition of a predicate p consists of 
all the rules (or facts) having p in the head. 

A database D consists of all the facts defining EDB predicates, whereas a
DATALOG¬ program P consists of the rules defining IDB predicates. It is assumed
that programs are safe (Ullman 1988), i.e. variables appearing in the head or in
negative body literals are range restricted as they appear in some positive body
literal, and that possible constants in P are taken from the database domain. For
each rule, variables appearing in the head are said to be universally quantified, 
whereas the remaining variables are said to be existentially quantified. 

The class of all DATALOG¬ programs is simply called DATALOG¬; the subclass
of all positive (resp. stratified) programs is called DATALOG (resp. DATALOG¬s)
(Abiteboul et al. 1995). Observe that DATALOG ⊆ DATALOG¬s ⊆ DATALOG¬ (the
class of Datalog queries with possibly unstratified negation).

Semantics. The semantics of a positive program P is given by the unique minimal
model MM(P). The semantics of programs with negation P is given by the set of
its stable models SM(P). An interpretation M is a stable model (or answer set) of
P if M is the unique minimal model of the positive program P^M, where P^M denotes
the positive logic program obtained from ground(P) by removing (i) all rules r such
that there is a negative literal ¬A in the body of r and A is in M, and (ii) all the
negative literals from the remaining rules (Gelfond and Lifschitz 1988). It is
well-known that a program may have n stable models with n ≥ 0. Stratified programs
have a unique stable model which coincides with the perfect model, obtained by
partitioning the program into an ordered number of suitable subprograms (called
'strata') and computing the fixpoints of every stratum in their order (Ullman 1988).
Given a set of ground atoms S and an atom g(t), S[g] (resp. S[g(t)]) denotes the
set of g-tuples (resp. tuples matching g(t)) in S.
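
The definition of stable model can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: ground rules are represented as hypothetical triples (head, positive body, negative body) of atom strings, and the sample program is the ground version of the program of Example 1 over a database with two nodes and a single edge. The function reduct builds the positive program P^M by steps (i) and (ii), and is_stable compares its least model with M.

```python
from itertools import chain, combinations

def least_model(positive_rules):
    """Least model of a ground positive program given as (head, body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in positive_rules:
            if head not in model and all(b in model for b in body):
                model.add(head)
                changed = True
    return model

def reduct(rules, m):
    """Drop rules with a negated atom in m, then drop all negative literals."""
    return [(h, pos) for h, pos, neg in rules if not any(a in m for a in neg)]

def is_stable(rules, m):
    """m is stable iff it is the least model of the reduct w.r.t. m."""
    return least_model(reduct(rules, m)) == set(m)

def stable_models(rules):
    """Brute-force enumeration over subsets of the head atoms."""
    atoms = sorted({h for h, _, _ in rules})
    subsets = chain.from_iterable(
        combinations(atoms, r) for r in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_stable(rules, set(s))]

# Ground program of Example 1 for node(a), node(b), edge(a,b).
facts = [("node(a)", [], []), ("node(b)", [], []), ("edge(a,b)", [], [])]
rules = facts + [
    ("v(a)", ["node(a)"], ["nv(a)"]),
    ("nv(a)", ["node(a)"], ["v(a)"]),
    ("v(b)", ["node(b)"], ["nv(b)"]),
    ("nv(b)", ["node(b)"], ["v(b)"]),
    ("c", ["edge(a,b)"], ["v(a)", "v(b)", "c"]),
]
models = stable_models(rules)
```

Restricting the enumeration to subsets of head atoms is safe because the least model of a reduct can only contain atoms occurring in rule heads. In this sample the interpretation placing both a and b in nv is rejected, since the rule defining c then has no stable extension.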

DATALOG¬ Search and Optimization Queries. Search and optimization problems can
be expressed using different logic formalisms such as Datalog with unstratified
negation.

Definition 1 

A DATALOG¬ search query is a pair Q = ⟨P, g(t)⟩, where P is a DATALOG¬ program
and g(t) is an atom s.t. g is an IDB predicate of P. The answer to Q over a
database D is Q(D) = {M[g(t)] | M ∈ SM(P ∪ D)}. The answer to the DATALOG¬
optimization query opt(Q) = ⟨P, opt|g(t)|⟩, where opt is either max or min, over
a database D, consists of the answers in Q(D) with the maximum or minimum
(resp., if opt = max or min) cardinality and is denoted by opt(Q)(D). □

Observe that, for the sake of simplicity, optimization queries computing the maxi- 
mum or minimum cardinality of the output relation are considered, although any 
polynomial function might be used. Therefore, the answer here considered is a set 
of sets of atoms. Possible and certain answers can be obtained by considering the 
union or the intersection of the sets, respectively. Instead of considering possible 
and certain reasoning, we introduce non-deterministic answers as follows. 

Definition 2 

A (non-deterministic) answer to a DATALOG¬ search query Q applied to a database
D is Q(D) = S where S is a relation selected non-deterministically from Q(D).
A (non-deterministic) answer to a DATALOG¬ optimization query opt(Q) over a
database D is opt(Q)(D) = S where S is a relation selected non-deterministically 
from opt(Q)(D). □ 

It is worth noting that, as for search queries, for optimization queries
the relation with optimal cardinality, rather than just the cardinality, is returned.
Thus, given a search query Q = ⟨P, g(t)⟩ and a database D, the output relation
Q(D) consists of all tuples g(u) matching g(t) and belonging to a stable
model M of P ∪ D, selected non-deterministically. For a given optimization query
OQ = ⟨P, min|g(t)|⟩ (resp. ⟨P, max|g(t)|⟩), the output relation OQ(D) consists
of the set of tuples g(u) matching g(t) and belonging to a stable model M of
P ∪ D, selected non-deterministically among those which minimize (resp. maximize)
the cardinality of the output relation. From now on, we concentrate our attention
on non-deterministic queries. An example of a non-deterministic DATALOG¬ search
query is shown in Example 1; the optimization problem is expressed by rewriting
the query goal as ⟨P1, min|v(X)|⟩ whose meaning is to further restrict the set of
stable models to those for which v has minimum cardinality.

In (Saccà 1997) and (Greco and Saccà 1997) it has been shown that DATALOG¬
search and optimization queries under (non-deterministic) stable model semantics
express the class of NP search and optimization problems (denoted, respectively,
by QNPMV and OPTQNPMV). 

3 DATALOG¬s,⊕,⇐

The problem in using DATALOG¬ to express search and optimization problems is that
the use of unrestricted negation is often neither simple nor intuitive and, besides,
it does not allow expressive power (and complexity) to be controlled and in some
cases might also lead to writing queries having no stable models. In order to avoid
these problems, we present a language, called DATALOG¬s,⊕,⇐, where unstratified
negation is embedded into built-in constructs, so that the user is forced to write
programs using restricted forms of negation without loss of expressive power.
Specifically, DATALOG¬s,⊕,⇐ extends DATALOG¬s with two simple built-in constructs: head
(exclusive) disjunction and constraints, denoted by ⊕ and ⇐, respectively.
Syntax. In the following rules, Body(X) and Body(X, Y, L) are conjunctions of
literals, whereas X and Y are vectors of range restricted variables.

An (exclusive) disjunctive rule is of the form:

p_1(X_1) ⊕ ⋯ ⊕ p_k(X_k) ← Body(X)    (1)

where X_i ⊆ X for all i ∈ [1..k]. The intuitive meaning of such a rule is that if
Body(X) is true, then exactly one head atom p_i(X_i) must be true.

A special form of disjunctive rule, called generalized disjunctive rule, of the form:

⊕_L p(X, L) ← Body(X, Y, L)    (2)

is also allowed. In this rule the number of head disjunctive atoms is not fixed, but
depends on the database instance and on the current computation (stable model).
The intuitive meaning of this rule is that the relation defined by π_X Body(X, Y, L)
(the projection of the relation Body(X, Y, L) on the attributes defined by X)
is partitioned into a number of subsets equal to the cardinality of the relation
π_L Body(X, Y, L) (the number of distinct values for the variable L). Some examples
of generalized disjunctive rules will be presented in the next section.

A constraint (rule) is of the form:

⇐ Body(X)    (3)

A ground constraint rule is satisfied w.r.t. an interpretation I if the body of the rule
is false in I. We shall often write constraints using rules of the form A_1 ∨ … ∨ A_k ⇐
B_1, …, B_m (or B_1, …, B_m ⇒ A_1 ∨ … ∨ A_k) to denote a constraint of the form
⇐ B_1, …, B_m, ¬A_1, …, ¬A_k (i.e. negative literals are moved from the body to the
head). For instance, the constraint ⇐ edge(X, Y), ¬v(X), ¬v(Y) of Example 2 can
be rewritten as v(X) ∨ v(Y) ⇐ edge(X, Y) or as edge(X, Y) ⇒ v(X) ∨ v(Y). Here the
symbol ∨ denotes inclusive disjunction and is different from ⊕, as the latter denotes
exclusive disjunction. It should be recalled that inclusive disjunction allows more
than one atom to be true while exclusive disjunction allows only one atom to be
true.

Definition 3 

A DATALOG¬s,⊕,⇐ search query is a pair Q = ⟨P, g(t)⟩, where P is a DATALOG¬s,⊕,⇐
program and g(t) is an IDB atom. A DATALOG¬s,⊕,⇐ optimization query is a pair
⟨P, opt|g(t)|⟩, where opt is either max or min. □

The query ⟨P2, v(X)⟩ of Example 2 is a DATALOG¬s,⊕,⇐ search query, whereas the
query ⟨P2, min|v(X)|⟩ is a DATALOG¬s,⊕,⇐ optimization query.

Semantics. The declarative semantics of a DATALOG¬s,⊕,⇐ query is given in terms
of an 'equivalent' DATALOG¬ query and stable model semantics. Specifically, given a
DATALOG¬s,⊕,⇐ program P, st(P) denotes the standard DATALOG¬ program derived
from P as follows:



1. Every standard rule in P belongs to st(P),

2. Every disjunctive rule r ∈ P of the form (1) is translated into k rules of the
form:

p_j(X_j) ← Body(X), ¬p_1(X_1), …, ¬p_{j-1}(X_{j-1}), ¬p_{j+1}(X_{j+1}), …, ¬p_k(X_k)

with j ∈ [1..k], plus ((k − 1) × k)/2 constraints of the form:

⇐ Body(X), p_i(X_i), p_j(X_j)

with i, j ∈ [1..k] and i < j. It is worth noting that the constraints are
necessary only if p_j is defined by some other rule.

3. Every generalized disjunctive rule of the form (2) is translated into the two
rules:

p(X, L) ← Body(X, Y, L), ¬diff_p(X, L)
diff_p(X, L) ← Body(X, Y, L), p(X, L'), L' ≠ L

where diff_p is a new predicate symbol and L' is a new variable, plus the
constraint:

⇐ Body(X, Y, L), p(X, L_1), p(X, L_2), L_1 ≠ L_2

Here diff_p is used to avoid inferring two ground atoms p(x, l_1) and p(x, l_2)
with l_1 ≠ l_2. Observe that even in this case the constraint has to be introduced
if p is defined by some other rule.

4. Every constraint rule of the form (3) is translated into a rule of the form:

c ← Body(X), ¬c

where c is a new predicate symbol not appearing elsewhere.

For any DATALOG¬s,⊕,⇐ search query Q = ⟨P, g(t)⟩ (resp. optimization query
OQ = ⟨P, opt|g(t)|⟩), st(Q) = ⟨st(P), g(t)⟩ (resp. st(OQ) = ⟨st(P), opt|g(t)|⟩)
denotes the corresponding DATALOG¬ search (resp. optimization) query.
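
Steps 2 and 4 of the translation st can be sketched as a purely syntactic rewriting. The following Python fragment is a minimal illustration, not the implementation used by the prototype: rules are handled as plain strings, the function names are hypothetical, and ASCII renderings are assumed for the connectives ("<-" for the rule arrow, "not" for negation, and a leading "<=" for constraints).

```python
from itertools import combinations

def translate_disjunctive(heads, body):
    """Rewrite p_1 (+) ... (+) p_k <- body into k rules plus k(k-1)/2 constraints.

    heads and body are lists of atom strings. Returns (rules, constraints).
    """
    rules = []
    for j, head in enumerate(heads):
        # every other head atom is added, negated, to the body of rule j
        others = ["not " + h for i, h in enumerate(heads) if i != j]
        rules.append(f"{head} <- {', '.join(body + others)}")
    # one constraint per unordered pair of head atoms
    constraints = [f"<= {', '.join(body + [a, b])}"
                   for a, b in combinations(heads, 2)]
    return rules, constraints

def translate_constraint(body, fresh="c"):
    """Rewrite <= body into c <- body, not c, where c is assumed to be fresh."""
    return f"{fresh} <- {', '.join(body + ['not ' + fresh])}"

# The two rules of the program of Example 2.
rules, constraints = translate_disjunctive(["v(X)", "nv(X)"], ["node(X)"])
check = translate_constraint(["edge(X,Y)", "not v(X)", "not v(Y)"])
```

Applied to the program of Example 2, the sketch yields the two rules defining v and nv and the rule defining c of Example 1, plus one pairwise constraint which, as noted in step 2, is redundant when the head predicates are not defined by other rules.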

Definition 4 

Given a DATALOG¬s,⊕,⇐ query Q and a database D, the (non-deterministic) answer
to the query Q over D is obtained by applying the DATALOG¬ query st(Q) to D,
i.e. Q(D) = st(Q)(D). □ 

It is worth noting that DATALOG¬s,⊕,⇐ has the same expressive power as
DATALOG¬, that is, both NP search and optimization problems can be expressed
by means of DATALOG¬s,⊕,⇐ queries under stable model semantics
(Zumpano et al. 2004). The further restricted languages DATALOG¬s,⊕ (Datalog with
stratified negation and exclusive disjunction) and DATALOG⊕,⇐ (Datalog with exclusive
disjunction and constraints) have the same expressive power. A similar result
has been presented in (East and Truszczynski 2006), where it has been shown that
positive Datalog with constraints and head (inclusive) disjunction, called PS logic,
has the same expressive power as DATALOG¬. Clearly, DATALOG⊕,⇐ is captured by PS
logic, since exclusive disjunction can be emulated by using inclusive disjunction and
constraints. It is interesting to observe that analogous results could be obtained for
other ASP languages. For instance, (positive) Datalog with cardinality constraints,
as proposed in Smodels, captures the expressive power of DATALOG¬ since exclusive
disjunction and denial constraints can be emulated by means of cardinality
constraints (Niemelä et al. 1999).

4 NP Datalog

We now present a simplified version of DATALOG¬s,⊕,⇐ introducing further
restrictions on disjunctive rules. The basic idea consists in restricting DATALOG¬s,⊕,⇐ to
obtain, without loss of expressive power, a language which can be executed more
efficiently or easily translated into other formalisms.

In the following rules, Body(X,Y) denotes a conjunction of literals where X and 
Y are vectors of range restricted variables. 

A partition rule is a disjunctive rule of the form: 

p_1(X) ⊕ ⋯ ⊕ p_k(X) ← Body(X, Y)    (4)

or of the form

p_0(X, c_1) ⊕ ⋯ ⊕ p_0(X, c_k) ← Body(X, Y)    (5)

where p_0, p_1, …, p_k are distinct IDB predicates not defined elsewhere in the
program and c_1, …, c_k are distinct constants. The intuitive meaning of these rules is
that the projection of the relation defined by Body(X, Y) on X is partitioned
non-deterministically into k relations or k distinct sets of the same relation. Clearly,
every rule of form (5) can be rewritten into a rule of form (4) and vice versa.

A generalized partition rule is a (generalized) disjunctive rule of the form:

⊕_L p(X, L) ← Body(X, Y), d(L)    (6)

where p is an IDB predicate not defined elsewhere and d is a database domain
predicate specifying the domain of the variable L. The intuitive meaning of such a
rule is that the projection of the relation defined by Body(X, Y) on X is partitioned
into a number of subsets equal to the cardinality of the relation d.

In the following, the existence of subset rules is also assumed, i.e. rules of the form

s(X) ⊆ Body(X, Y)    (7)

where s is an IDB predicate not defined elsewhere in the program. Observe that 
a subset rule of the form above corresponds to the generalized partition rule with 
d = {0, 1}. On the other hand, every generalized partition rule can be rewritten 
into a subset rule and constraints. 

In the previous rules, Body(X,Y) is a conjunction of literals not depending on
predicates defined by partition or subset rules. We recall that the subset rules used
here are based on the proposal in (Greco and Saccà 1997). A similar type of subset
rules has also been proposed by (Gelfond 2002) where the language ASET-Prolog
(an extension of A-Prolog with sets) is presented.
Definition 5

An NP Datalog program consists of three distinct sets of rules:

1. partition and subset rules defining guess (IDB) predicates,

2. standard stratified datalog rules defining standard (IDB) predicates, and

3. constraint rules,

where every guess predicate is defined by a unique subset or partition rule. □

According to the definition above, the set of IDB predicates of an NP Datalog
program can be partitioned into two distinct subsets (namely, guess and standard) 
depending on the rules used to define them. Clearly, predicates defined by partition 
or subset rules are not recursive as the body of these rules cannot contain guess 
predicates or predicates depending on guess predicates. 

Example 3 

Vertex cover (version 3). The NP Datalog program:

v(X) ⊆ node(X).
edge(X,Y) ⇒ v(X) ∨ v(Y).

is derived from the one presented in Example 2 by replacing the disjunctive rule
defining v with a subset rule. □

It is important to note that here a simpler form of DATALOG¬s,⊕,⇐ queries is
considered. Therefore, NP Datalog ⊆ DATALOG¬s,⊕,⇐ and every NP Datalog query
can be rewritten into an equivalent DATALOG¬ query.

Definition 6 

An NP Datalog search query is a pair Q = ⟨P, g(t)⟩, where P is an NP Datalog
program and g(t) is an IDB atom denoting the output relation. An NP Datalog
optimization query is a pair ⟨P, opt|g(t)|⟩, where opt ∈ {max, min}. □

Observe that, for the sake of simplicity, our attention is restricted to optimization
queries computing the maximum or minimum cardinality of the output relation,
although any polynomial function might be used. Moreover, as stated by the following
theorem, NP Datalog captures the complexity classes of NP search and optimization
problems.

Theorem 1 

1. search(NP Datalog) = QNPMV, and

2. opt(NP Datalog) = OPT QNPMV.

Proof. Membership is trivial as NP Datalog ⊆ DATALOG¬s,⊕,⇐.
To prove hardness the well-known Fagin's result is used (Fagin 1974) (see
also (Johnson 1990; Papadimitriou 1994)): it states that every NP recognizable
database collection is defined by an existential second order formula (∃R)Φ,
where R is a list of new predicate symbols and Φ is a first-order formula
involving predicate symbols in a database schema DS and in R. As shown in
(Kolaitis and Papadimitriou 1991), this formula is equivalent to one of the form
(second order Skolem normal form)

(∃S)(∀X)(∃Y)(θ_1(X, Y) ∨ … ∨ θ_k(X, Y))

where S is a superlist of R, θ_1, …, θ_k are conjunctions of literals involving variables
in X and Y, and predicate symbols in S and DS. Consider the program P:

s_j(W_j) ⊕ ŝ_j(W_j) ←        (∀ s_j ∈ S)
q(X) ← θ_i(X, Y)              (1 ≤ i ≤ k)
g ← ¬q(X)

The first group of rules selects a set of constants from the database domain, for
each predicate symbol s_j. The second group of rules implements the above second
order formula. The third rule checks if there is some X for which the formula is not
satisfied. Therefore, the formula is satisfied if, and only if, there is a stable model
M such that ¬g ∈ M. □

Thus, NP Datalog has the same expressive power as both DATALOG¬s,⊕,⇐ and
DATALOG¬. The idea underlying NP Datalog is that NP search and optimization
problems can be expressed using partition (or subset) rules to guess partitions or
subsets of sets, whereas constraints are used to verify properties to be satisfied
by guessed sets or sets computed by means of stratified rules. It is important to
observe that the proof of Theorem 1 follows a schema which has been used in
other proofs concerning the expressive power of Datalog with negation under stable
model semantics (Schlipf 1995; Saccà 1997; Baral 2003). Indeed, in such proofs (see
for instance the proof of Theorem 6.3.1 in (Baral 2003)) negation is only used to
express exclusive disjunction and constraints.

Although the aim of this work is not the definition of techniques for the efficient
computation of queries, we point out that NP Datalog programs can be
computed following the classical stratified fixpoint algorithm enriched with a guess
and check technique.
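
As a rough illustration of this guess, deduce and check scheme, the following Python sketch evaluates the dominating set program of Example 7 by brute force. It is only a sketch under stated assumptions, not the algorithm of the prototype: the function name and the sample graph are hypothetical, the edge relation is assumed to be symmetric, and the fixpoint loop is the naive one (it stabilizes after one round here because the standard rule is not recursive).

```python
from itertools import combinations

def dominating_sets(nodes, edges):
    """Yield every guess v that passes the check, smallest guesses first."""
    ordered = sorted(nodes)
    # edge is treated as symmetric here (an assumption about the database)
    sym = set(edges) | {(y, x) for x, y in edges}
    for size in range(len(ordered) + 1):
        for guess in combinations(ordered, size):
            v = set(guess)          # guess:  v(X) is a subset of node(X)
            connected = set()       # deduce: connected(X) <- edge(X,Y), v(Y)
            while True:             # naive fixpoint of the standard rules
                derived = {x for x, y in sym if y in v}
                if derived <= connected:
                    break
                connected |= derived
            # check: node(X), not v(X) => connected(X)
            if all(x in v or x in connected for x in ordered):
                yield v

# Hypothetical database: a star with centre c and leaves a, b, d.
nodes = {"a", "b", "c", "d"}
edges = {("c", "a"), ("c", "b"), ("c", "d")}
first = next(dominating_sets(nodes, edges))
```

Because guesses are enumerated by increasing size, the first guess that passes the check is also an answer to the corresponding minimization query.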

The advantage of expressing search and optimization problems by using rules
with built-in predicates rather than standard DATALOG¬ rules is that the use of
built-in atoms preserves simplicity and intuition in expressing problems and
allows queries to be easily optimized and translated into other target languages for
which efficient executors exist. A further advantage is that the use of built-in
predicates in expressing optimization queries permits us to easily identify problems for
which "approximate" answers can be found in polynomial time. For instance,
maximization problems defined by constraint-free NP Datalog queries where negation is
only applied to guess atoms or atoms not depending on guess atoms (called
deterministic) are constant approximable (Greco and Saccà 1997). Indeed, these problems
belong to the class of constant approximable optimization problems MAX Σ1²
(Kolaitis and Thakur 1995) and, therefore, NP Datalog could also be used to define
the class of approximable optimization problems, but this is outside the scope of
this paper.

2 This class was first introduced in (Papadimitriou and Yannakakis 1982) as MAX NP.

NP Datalog allows us to also use a finite subset of the integer domain and the
standard built-in arithmetic operators. More specifically, reasoning and computing
over a finite set of integer ranges is possible with the unary predicate integer, which
consists of the facts integer(x), with MinInt ≤ x ≤ MaxInt, and the standard
arithmetic operators defined over the integer domain.

The following example shows how the arithmetic operators could be used to
compute prime numbers:

composite(X) ← integer(Y), integer(Z), X = Y * Z.
prime(X) ← integer(X), not composite(X).

Thus, the language allows arithmetic expressions which involve variables taking
integer values to appear as operands of comparison operators (see Example 9).


Some examples are now presented showing how classic search and optimization 
problems can be defined in MW atalog. 

Example 4 

Max satisGability. Two unary relations c and a are given in such a way that a fact 
c(x) denotes that x is a clause and a fact a(v) asserts that v is a variable occurring 
in some clause. We also have two binary relations p and n such that the facts p(x, v) 
and n(x, v) state that a variable v occurs in the clause x positively or negatively, 
respectively. A boolean formula, in conjunctive normal form, can be represented by 
means of the relations c, a, p and n. 

The maximum number of clauses simultaneously satisfiable under some truth as- 
signment can be expressed by the query (Psat, max|f (X)|) where V sa _t is the following 
program: 



Observe that the max satisfiability problem is constant approximable as no con- 
straints are used and negation is applied to guess atoms only. 

In the following examples, a database graph G = (N, E) defined by means of the 
unary relation node and the binary relation edge is assumed. 

Example 5 

k-Coloring. Consider the well-known problem of k-colorability consisting in finding 
a k-coloring, i.e. an assignment of one of k possible colors to each node of a graph G 
such that no two adjacent nodes have the same color. The problem can be expressed 
by means of the MWatalog query (T'k-coi, col(X, C)} where V\- ca \ consists of the 
following rules: 

©c col(X, C) «- node(X), color(C). 
<^edge(X,Y), col(X, C), col(Y, C). 

and the base relation color contains exactly k colors. The first rule guesses an 



Examples 



s(X) C 
f(X) «- 
f(X) «- 



a(X). 

p(X,V), s(V). 
n(X,V), -s(V). 



□ 
assignment of colors to the nodes of the graph, while the constraint verifies that
two joined vertices do not have the same color. □

Example 6

Min Coloring. The query modeling the Min Coloring problem is obtained from the
k-coloring example by adding a rule storing the used colors as follows:

⊕_C col(X, C) ← node(X), color(C).
⇐ edge(X, Y), col(X, C), col(Y, C).
used_color(C) ← col(X, C).

and replacing the query goal with min|used_color(C)|. □

Example 7

Min Dominating Set. Given a graph G = (N, E), a subset of the vertex set V ⊆ N
is a dominating set if for all u ∈ N − V there is a v ∈ V such that (u, v) ∈ E.
The NP Datalog query ⟨P_ds, v(X)⟩ expresses the problem of finding a dominating
set, where P_ds is the following program:

v(X) ⊆ node(X).
connected(X) ← edge(X, Y), v(Y).
node(X) ∧ ¬v(X) ⇒ connected(X).

The constraint states that every node not belonging to the dominating set, namely
the relation v, must be connected to some node in v. A dominating set is said to
be minimum if its cardinality is minimum. Therefore, the optimization problem is
expressed by replacing the query goal v(X) with min|v(X)|. □

Note that if an NP-minimization query has an empty answer there is no solution
for the associated search problem.

Example 8

Min Edge Dominating Set. Given a graph G = (N, E), a subset of the edge set
A ⊆ E is an edge dominating set if for all e_1 ∈ E − A there is an e_2 ∈ A such that
e_1 and e_2 are adjacent. The min edge dominating set problem is defined by the
NP Datalog query ⟨P_eds, min|e(X, Y)|⟩ where P_eds consists of the following rules:

e(X, Y) ⊆ edge(X, Y).
v(X) ← e(X, Y).
v(Y) ← e(X, Y).
edge(X, Y) ⇒ v(X) ∨ v(Y). □

Example 9

N-Queens. This problem consists in placing N queens on an N x N chessboard in
such a way that no two queens are in the same row, column, or diagonal. It can
be expressed by the NP Datalog query ⟨P_queen, queen(R, C)⟩ where P_queen consists
of the following rules:

⊕_C queen(R, C) ← num(R), num(C).
⇐ queen(R_1, C), queen(R_2, C), R_1 ≠ R_2.
⇐ queen(R_1, C_1), queen(R_2, C_2), R_1 ≠ R_2, R_1 + C_1 = R_2 + C_2.
⇐ queen(R_1, C_1), queen(R_2, C_2), R_1 ≠ R_2, R_1 − C_1 = R_2 − C_2.

The database contains facts of the form num(1) ... num(N) for the N-queens problem.
The partition rule assigns to each row exactly one queen. The first constraint states
that no two different queens are in the same column. The last two constraints state 
that no two different queens are on the same diagonal. □ 
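The reading of the N-Queens program given above can be checked by brute force. The following Python sketch is only an illustration and is not part of the NP Datalog system: the function names `satisfies_constraints` and `n_queens_solutions` are hypothetical. It enumerates every choice allowed by the partition rule (one column per row) and keeps the placements that violate none of the three constraints.

```python
from itertools import product


def satisfies_constraints(queens):
    """Check the three constraints of the N-Queens program on a list of (row, column) pairs."""
    for (r1, c1) in queens:
        for (r2, c2) in queens:
            if r1 == r2:
                continue  # every constraint requires R_1 != R_2
            # Constraint 1: same column. Constraints 2 and 3: same diagonal.
            if c1 == c2 or r1 + c1 == r2 + c2 or r1 - c1 == r2 - c2:
                return False
    return True


def n_queens_solutions(n):
    """Enumerate the placements that play the role of stable models of the query."""
    nums = range(1, n + 1)  # facts num(1) ... num(n)
    solutions = []
    # Partition rule: each row R is assigned exactly one column C.
    for cols in product(nums, repeat=n):
        queens = list(zip(nums, cols))
        if satisfies_constraints(queens):
            solutions.append(queens)
    return solutions
```

The enumeration visits N^N candidates, so it is only usable for very small boards; it is meant to make the declarative reading concrete, not to compete with a constraint solver.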

Example 10 

Latin Squares. This problem consists in filling an N x N table with N different 
symbols in such a way that each symbol occurs exactly once in each row and 
exactly once in each column. Tables are partially filled. The NP Datalog query
⟨P_ls, square(R, C, V)⟩ expresses the problem, where P_ls consists of the following
rules:

⊕_V square(R, C, V) ← num(R), num(C), num(V)
⇐ square(R, C_1, V), square(R, C_2, V), C_1 ≠ C_2
⇐ square(R_1, C, V), square(R_2, C, V), R_1 ≠ R_2
square(R, C, V) ⇐ preassigned(R, C, V)

The database contains facts of the form num(1) ... num(N) for an N x N table and
facts of the form preassigned(R, C, V) whose meaning is that the entry (R, C) of the
table contains the symbol V (here the symbols used are the numbers from 1 to N).
The partition rule assigns exactly one symbol to each entry of the table. The first 
(resp. second) constraint states that a symbol cannot occur more than once in the 
same row (resp. column). The last constraint states that preassigned symbols must 
be respected. □ 



5 Translating NP Datalog Queries into OPL Programs

Several languages have been designed and implemented for hard search and
optimization problems. These include logic languages based on stable models (e.g.
DeRes, DLV, ASSAT, Smodels, Cmodels, Clasp) (Cholewinski et al. 1996;
Leone et al. 2006; Lin and Zhao 2004; Simons et al. 2002; Lierler 2005a;
Lierler 2005b; Gebser et al. 2007), constraint logic programming systems
(e.g. SICStus Prolog, ECLiPSe, XSB, Mozart) (SICStus Prolog Web Site;
Wallace and Schimpf 1999; Rao et al. 1997; Van Roy et al. 199?) and
constraint programming languages (e.g. ILOG OPL, Lingo) (Van Hentenryck 1988;
Finkel et al. 2004). The advantage of logic languages based on stable model
semantics over constraint programming is their ability to express
complex NP problems in a declarative way. On the other hand, constraint
programming languages are very efficient in solving optimization problems.

As NP Datalog is a language to express NP problems, the language can be
implemented by translating queries into target languages specialized in
combinatorial optimization problems, such as constraint programming
languages. The implementation of NP Datalog is carried out by means of a system
prototype translating NP Datalog queries into OPL programs. OPL is a constraint
programming language well-suited for solving both search and optimization
problems. OPL programs are executed by means of the ILOG OPL Development
Studio (ILOG OPL Studio). This section shows how NP Datalog queries are translated
into OPL programs.

NP Datalog programs have an associated database schema specifying the database
domains used and, for each base predicate, the domain associated with each
attribute. For instance, the database schema associated with the min coloring query
of Example 6 is:

DOMAINS: node; color.
PREDICATES: edge(node, node).

Starting from the database schema, the compiler also deduces the schema of
every derived predicate and introduces new domains, obtained from the database
domains. For instance, for the program of Example 6 the schemas associated
with the predicates col and used_color are, respectively, col(node, color) and
used_color(color). Considering the program of Example 6 and assuming that it
also contains the following rules:

p(X) ← node(X).
p(X) ← color(X).
q(X) ← node(X), color(X).

the schemas associated with p and q are p(D_p) and q(D_q) where D_p is the union
of the domains node and color, whereas D_q is the intersection of the domains
node and color. Database domain instances are defined by means of unary ground
facts. Integer domains are declared differently. For instance, the database schema
associated with the N-queens program of Example 9 is as follows:

INT-DOMAINS: num.

Moreover, whenever the integer predicate is used in a program, the range of
considered integers has to be specified in the schema, as shown in the following
example:

MinInt = 0.
MaxInt = 10.

A predicate p is said to be constrained if i) p depends on a guess predicate, 
and ii) there is a constraint or an optimized query goal containing p or contain- 
ing a predicate q which depends on p. Moreover, a constrained predicate is said 
to be recursion-dependent if it is recursive or depends on a constrained recursive 
predicate. 

Every NP Datalog program P consists of a set P_S of standard rules, a set P_G
of rules defining guess predicates and a set P_C of constraints. A program P =
P_S ∪ P_G ∪ P_C can be also partitioned into four sets:

1. P^1 = P_S^1 consisting of the set of rules defining standard predicates not
depending on guess predicates;

2. P^2 = P_G ∪ P_S^2 ∪ P_C^2 consisting of (i) the set of rules defining guess predicates
(P_G), (ii) the set of standard rules defining constrained predicates which are
not recursion-dependent (P_S^2) and (iii) the set of constraints P_C^2 containing
only base predicates and predicates defined in P_G ∪ P_S^1 ∪ P_S^2;

3. P^3 = P_S^3 ∪ P_C^3 consisting of the set of standard rules defining constrained,
recursion-dependent predicates (P_S^3) and the set of constraints P_C^3 containing
predicates defined in P_S^3;

4. P^4 = P_S^4 consisting of the set of rules defining standard predicates which
depend on guess predicates and are not constrained.

The evaluation of an NP Datalog program P over a database DB is carried out
by performing the following steps:

1. Firstly, the (unique) stable model of (P_S^1, DB) (say DB ∪ M_1) is computed.

2. Next, a stable model of (P_S^2 ∪ P_G, DB ∪ M_1) satisfying the constraints P_C^2
(say DB ∪ M_1 ∪ M_2) is computed.

3. Afterwards, if a model DB ∪ M_1 ∪ M_2 exists, the (unique) stable model of
(P_S^3, DB ∪ M_1 ∪ M_2) (say DB ∪ M_1 ∪ M_2 ∪ M_3) is computed. If this model
satisfies the constraints P_C^3, then the next step is executed, otherwise the
second step is executed again, that is, another stable model of (P_S^2 ∪ P_G, DB ∪
M_1) satisfying the constraints P_C^2 is computed.

4. Finally, if a model DB ∪ M_1 ∪ M_2 ∪ M_3 satisfying the constraints P_C^3 exists,
the (unique) stable model of (P_S^4, DB ∪ M_1 ∪ M_2 ∪ M_3) is evaluated.

It is worth noting that, if there is no constrained recursive predicate, the component
P^3 is empty and then an NP Datalog program can be evaluated by performing only
steps 1, 2 and 4 (that is, the iteration introduced in step 3 is not needed).
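The control flow of the four evaluation steps can be sketched as a generate-and-test loop. The Python code below is a hedged illustration only: `evaluate` and all the `toy_*` components are hypothetical names introduced here, interpretations are modelled as frozensets of (predicate, argument) pairs, and the toy components do not correspond to any program in the paper.

```python
from itertools import combinations


def evaluate(db, fixpoint_p1, models_p2, fixpoint_p3, satisfies_c3, fixpoint_p4):
    """Return one model following steps 1-4, or None if no candidate survives."""
    m1 = fixpoint_p1(db)                       # step 1: deterministic component
    for m2 in models_p2(db | m1):              # step 2, repeated when step 3 fails
        m3 = fixpoint_p3(db | m1 | m2)         # step 3: deterministic component
        model = db | m1 | m2 | m3
        if satisfies_c3(model):
            return model | fixpoint_p4(model)  # step 4: deterministic component
    return None


# Toy components (assumptions made only for this illustration).
def toy_p1(interp):
    return frozenset()  # no rules in the first component


def toy_models_p2(interp):
    # Guess a subset v of the nodes, smallest subsets first.
    nodes = sorted(x for (pred, x) in interp if pred == "node")
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            yield frozenset(("v", x) for x in subset)


def toy_p3(interp):
    return frozenset(("marked", x) for (pred, x) in interp if pred == "v")


def toy_c3(interp):
    # Stand-in for the constraints checked in step 3: some node must be marked.
    return any(pred == "marked" for (pred, _) in interp)


def toy_p4(interp):
    return frozenset(("answer", x) for (pred, x) in interp if pred == "v")


db = frozenset({("node", "a"), ("node", "b")})
result = evaluate(db, toy_p1, toy_models_p2, toy_p3, toy_c3, toy_p4)
```

In this toy run the empty guess is rejected in step 3, so step 2 is executed again and the guess containing only v(a) is accepted.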

The partition of programs into four components suggests that subprograms P_S^1,
P_S^3 and P_S^4 can be evaluated by means of the standard fixpoint algorithm. In the
following, stratified subprograms, such as P_S^1, P_S^3 and P_S^4, are called deterministic
as they have a unique stable model, whereas subprograms which may have zero or
more stable models are called non-deterministic. Thus, given a database DB and
an NP Datalog query Q = ⟨P, G⟩, we have to generate an OPL program equivalent
to the application of the query Q to the database DB.

We first show how the database is translated and next consider the translation of 
queries. 

Database translation. An integer domain relation is translated into a set of
integers, whereas a non-integer domain relation is translated into a set of strings. The
translation of a base relation with arity n > 0 consists of two steps: (i) declaring
a new tuple type with n fields (whose type is either string or integer, according to
the schema), (ii) declaring a set of tuples of this type. For instance, the translation
of the database containing the facts node(a), node(b), node(c), node(d), edge(a, b),
edge(a, c), edge(b, c) and edge(c, d), consists of the following OPL declarations:

{string} node = {a, b, c, d};
tuple edge_type {string a1; string a2;};
{edge_type} edge = {<a,b>, <a,c>, <b,c>, <c,d>};

The database num(1), num(2), num(3) for the N-Queens problem of Example 9 is
translated as follows:

{int} num = {1, 2, 3};

When an integer range is specified in the schema, the following set is added to
the OPL database:

{int} integer = asSet(MinInt .. MaxInt);

where MinInt and MaxInt are the values specified in the schema.
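The two-step database translation can be illustrated with a small generator of declaration strings. This Python sketch is an assumption-laden illustration, not the Database Compiler of the prototype: the function `translate_database` and its argument layout are hypothetical, the field names a1, a2, ... follow the listing above, and string constants are left unquoted as in that listing rather than checked against the OPL grammar.

```python
def translate_database(domains, relations):
    """Build OPL-style declarations for domain relations and base relations.

    domains:   dict mapping a domain name to its list of values
    relations: dict mapping a relation name to (field_types, list_of_tuples)
    """
    lines = []
    for name, values in domains.items():
        # Integer domains become sets of integers, all other domains sets of strings.
        kind = "int" if all(isinstance(v, int) for v in values) else "string"
        body = ", ".join(str(v) for v in values)
        lines.append("{%s} %s = {%s};" % (kind, name, body))
    for name, (field_types, tuples) in relations.items():
        # Step (i): declare a tuple type with one field per attribute.
        fields = " ".join("%s a%d;" % (t, i + 1) for i, t in enumerate(field_types))
        lines.append("tuple %s_type {%s};" % (name, fields))
        # Step (ii): declare the set of tuples of that type.
        body = ", ".join("<" + ",".join(str(v) for v in tup) + ">" for tup in tuples)
        lines.append("{%s_type} %s = {%s};" % (name, name, body))
    return lines


graph_lines = translate_database(
    {"node": ["a", "b", "c", "d"]},
    {"edge": (["string", "string"], [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])},
)
```

Applied to the graph database above, the sketch reproduces the three declarations shown in the text.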

Query translation. The translation of an NP Datalog query Q = ⟨P, G⟩ is carried
out by translating the deterministic subprograms into ILOG OPL Script programs
by means of a function Fixp and the non-deterministic subprograms into OPL
programs by means of a function W_P or a slightly different function W_Q if the
predicate in the query goal is defined in P^2. More specifically, Fixp(P) generates
an OPL script program which emulates the fixpoint computation of P, whereas
W_P(P) (resp. W_Q(Q)) translates the NP Datalog program P (resp. query Q) into
an equivalent OPL program.

It is worth noting that:

1. If the query goal is not defined over component P^4, this component does not
need to be evaluated and, therefore, it is not translated into an OPL Script
program.

2. If the query goal is defined in component P^1, we have to check that the
components P^2 and P^3 admit stable models.

3. If the query goal G is defined in component P^2, we have to compute the
query ⟨P^2, G⟩ over the stable model (which includes the database) obtained
from the computation of component P^1 and check that component P^3 admits
stable models.

4. Similarly, if the query goal G is defined in component P^3, we have to compute
the query ⟨P^3, G⟩ over a stable model of P^1 ∪ P^2 ∪ DB.

5. If the query goal G is defined in component P^4, first we compute a stable
model M for components P^1, P^2 and P^3 and next compute the fixpoint of
component P^4 over M.

First, we informally present how a deterministic component (P_S^1, P_S^3 and P_S^4 in
our partition) is translated into an ILOG OPL Script program, and next we show
how the remaining rules are translated into an OPL program.

Translation of deterministic components. The translation of a stratified program
P_S produces an ILOG OPL Script program which emulates the application of the
naive fixpoint algorithm to the rules in P_S.

The following example shows how a set of stratified rules is translated into an ILOG 
OPL Script program. 

Example 11 

Transitive closure. Consider the following NP Datalog program P_tc computing the
transitive closure of a graph:

tc(X, Y) ← edge(X, Y).
tc(X, Y) ← edge(X, Z), tc(Z, Y).

The corresponding OPL Script program Fixp(P_tc) is as follows:

// tc declaration
int tc[node][node];

execute {
    // exit rule
    for (var x in edge) {
        tc[x.a1][x.a2] = 1;
    }

    // recursive rule
    var modified = true;
    while (modified) {
        modified = false;
        for (var e in edge)
            for (var y in node)
                if (tc[e.a2][y] == 1 && tc[e.a1][y] == 0) {
                    tc[e.a1][y] = 1;
                    modified = true;
                }
    }
} □

In the program above we have three sets of statements declaring variables and
computing exit and recursive rules. Specifically:

1. A two-dimensional integer array tc is declared.

2. The first for block evaluates the exit rule defining tc by inserting each
edge into the transitive closure.

3. The recursive rule is evaluated by means of the classical naive fixpoint
algorithm (Ullman 1988). Specifically, the statements inside the while block
insert a pair (e.a1, y) in the transitive closure, if there exists an edge (e.a1, e.a2)
and a node y such that the transitive closure contains the pair (e.a2, y). The
loop ends when no more pairs of nodes can be derived.

If a program contains negated literals, it is possible to apply the stratified fixpoint 
algorithm, by dividing the rules into strata and computing one stratum at a time, 
following the order derived from the dependencies among predicate symbols. 
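For readers without access to the OPL tool chain, the naive fixpoint loop of Example 11 can be mirrored in Python. This is a sketch under the assumption that the graph is given as a list of nodes and a list of (a1, a2) pairs; the function name `transitive_closure` is hypothetical and the code is not produced by the Fixp function.

```python
def transitive_closure(nodes, edges):
    """Naive fixpoint evaluation of tc(X,Y) <- edge(X,Y) and tc(X,Y) <- edge(X,Z), tc(Z,Y)."""
    # tc declaration: a 0/1 table indexed by pairs of nodes.
    tc = {(x, y): 0 for x in nodes for y in nodes}

    # Exit rule: every edge belongs to the transitive closure.
    for (a1, a2) in edges:
        tc[(a1, a2)] = 1

    # Recursive rule: repeat until no new pair can be derived.
    modified = True
    while modified:
        modified = False
        for (a1, a2) in edges:
            for y in nodes:
                if tc[(a2, y)] == 1 and tc[(a1, y)] == 0:
                    tc[(a1, y)] = 1
                    modified = True

    return {pair for pair, flag in tc.items() if flag == 1}


closure = transitive_closure(
    ["a", "b", "c", "d"],
    [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")],
)
```

A stratified program with negation would run one such loop per stratum, lower strata first, so that negated predicates are fully computed before they are read.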

Translation of non-deterministic components. The translation of a non-deterministic
program P (denoted by W_P(P)) produces an OPL program. For
the sake of simplicity of presentation, it is assumed that P satisfies the following
conditions:

• guess predicates are defined by either generalized partition rules or subset
rules;

• standard predicates are defined by a unique extended rule of the form:

A ← body_1 ∨ ··· ∨ body_m

where body_i is a conjunction of literals;

• constraint rules are of the form A ⇐ B, where A is a disjunction of atoms and
B is a conjunction of atoms;

• rules do not contain two (or more) occurrences of the same variable taking
values from different domains;

• constants appear only in built-in atoms of the form x θ y where θ is a
comparison operator.

It should be noticed that the previous assumptions do not imply any limitation
as every program can be rewritten in such a way that it satisfies them. For instance,
the two rules defining the predicate f in Example 4 can be rewritten into the rule

f(X) ← (p(X, V), s(V)) ∨ (n(X, Z), ¬s(Z))

whereas the rules defining the predicate v in Example 8 can be rewritten in the
form

v(X) ← e(X, V) ∨ e(U, X)

Specifically, the function W_P receives in input a program P and gives in output
an OPL program consisting of two components W_P(P) = (T_D(P), T_P(P)) where
(i) T_D(P) consists of the definition of arrays of integers and decision variables, (ii)
T_P(P) translates the NP Datalog program into an OPL program. Analogously, the
function W_Q receives in input an NP Datalog query Q = ⟨P, G⟩ and gives in output
an OPL program consisting of two components W_Q(⟨P, G⟩) = (T_D(P), T_Q(⟨P, G⟩))
where T_Q(⟨P, G⟩) translates the NP Datalog query into an OPL program.
The function T_D(P) introduces some data structures for each IDB predicate
defined in P. Specifically, for each IDB predicate p defined in P^2 with arity k, a
k-dimensional array of boolean decision variables is introduced as follows:

dvar boolean p[D_1, ..., D_k];

where D_1, ..., D_k denote the domains on which the predicate p is defined. For
instance, for the binary predicate col of Example 6 the declaration

dvar boolean col[node, color];

is introduced. For any other IDB predicate q defined in P with arity m, an
m-dimensional array of integers is introduced as follows:

int q[D_1, ..., D_m];

where D_1, ..., D_m denote the domains on which the predicate q is defined.

The functions T_Q and T_P are defined as follows:

1. Query: T_Q(⟨P, G⟩) = T_Q(G) T_P(P)

2. Goal:

(a) T_Q(v(X_1, ..., X_k)) =

(b) T_Q(min|v(X_1, ..., X_k)|) =
    minimize sum(X_1 in dom(X_1), ..., X_k in dom(X_k)) v[X_1, ..., X_k];

(c) T_Q(max|v(X_1, ..., X_k)|) =
    maximize sum(X_1 in dom(X_1), ..., X_k in dom(X_k)) v[X_1, ..., X_k];

3. Sequence of rules: T_P(S_1 ... S_n) = subject to { T_P(S_1) ... T_P(S_n) };

4. Partition rules of the form

    ⊕_L s(X_1, ..., X_k, L) ← body(X_1, ..., X_k, Y_1, ..., Y_n), d(L)

are translated into the following OPL statement:

    forall(X_1 in dom(X_1), ..., X_k in dom(X_k))
        T_P(∃(Y_1, ..., Y_n) body(X_1, ..., X_k, Y_1, ..., Y_n)) > 0
            => sum(L in d) s[X_1, ..., X_k, L] == 1;
    forall(X_1 in dom(X_1), ..., X_k in dom(X_k), L in d)
        s[X_1, ..., X_k, L] > 0 => T_P(∃(Y_1, ..., Y_n) body(X_1, ..., X_k, Y_1, ..., Y_n)) > 0;

5. Subset rules of the form

    s(X_1, ..., X_k) ⊆ body(X_1, ..., X_k, Y_1, ..., Y_n)

are translated as follows:

    forall(X_1 in dom(X_1), ..., X_k in dom(X_k))
        s[X_1, ..., X_k] > 0 => T_P(∃(Y_1, ..., Y_n) body(X_1, ..., X_k, Y_1, ..., Y_n)) > 0;

6. Standard rules of the form

    p(X_1, ..., X_k) ← Body_1(X_1, ..., X_k, Y^1_1, ..., Y^1_{n_1}) ∨ ··· ∨ Body_m(X_1, ..., X_k, Y^m_1, ..., Y^m_{n_m})

are translated as follows:

    forall(X_1 in dom(X_1), ..., X_k in dom(X_k))
        p[X_1, ..., X_k] > 0 <=> T_P(∃(Y^1_1, ..., Y^1_{n_1}) Body_1) + ··· + T_P(∃(Y^m_1, ..., Y^m_{n_m}) Body_m) > 0;

where Y^i_1, ..., Y^i_{n_i} is the list of existentially quantified variables in Body_i.

7. Conjunction of literals with existentially quantified variables: A
conjunction of literals with n > 0 existentially quantified variables is translated
as follows:

    T_P(∃(Y_1, ..., Y_n) Body) = (sum(Y_1 in D_1, ..., Y_n in D_n) (T_P(Body)))

where D_j is the domain associated with the variable Y_j.

8. Conjunction of literals without existentially quantified variables:

    T_P(A_1, ..., A_k) = (T_P(A_1) * ··· * T_P(A_k))   if k > 0
    T_P(A_1, ..., A_k) = 1                             if k = 0

9. Literal:

    T_P(q(X_1, ..., X_k)) = q[X_1, ..., X_k],                        if q is a derived predicate,
    T_P(q(X_1, ..., X_k)) = (sum(<X_1, ..., X_k> in q) 1 > 0),       if q is a base predicate,
    T_P(q(X)) = (sum(X in q) 1 > 0),                                 if q is a domain predicate,
    T_P(E_1 θ E_2) = (E_1 θ E_2), where θ is a comparison operator and E_1, E_2 are either
    variables or constants or arithmetic expressions,
    T_P(¬A) = (1 - T_P(A));

10. Constraints of the form A_1 ∨ ··· ∨ A_m ⇐ body(X_1, ..., X_k) where body(X_1, ..., X_k)
is a conjunction of atoms are translated as follows:

    T_P(A_1 ∨ ··· ∨ A_m ⇐ body(X_1, ..., X_k)) =
        forall(X_1 in dom(X_1), ..., X_k in dom(X_k))
            T_P(body(X_1, ..., X_k)) > 0 => (T_P(A_1) + ··· + T_P(A_m)) > 0;

For m = 0 the above constraint becomes T_P(body(X_1, ..., X_k)) > 0 => false;.

Observe that the OPL code associated with the translation of a rule can be 
simplified by means of trivial reductions. As an example, an expression of the form: 
((c > 0) > 0) can be simply replaced by (c > 0), whereas expressions of the form 
1*1 are replaced by 1. 
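The translation rules above encode logical connectives as 0/1 arithmetic: a conjunction becomes a product, an existential quantifier a sum, and a negated literal 1 minus the value of the atom. The Python sketch below illustrates this reading on the constraint of Example 7; the helper names (`t_base`, `t_not`, `t_and`, `t_exists`, `is_dominating_set`) are hypothetical, and the code evaluates the encoding for a given guess v instead of generating OPL text.

```python
def t_base(relation, tup):
    """(sum(<...> in q) 1 > 0): 1 if the tuple occurs in the base relation, else 0."""
    return int(sum(1 for t in relation if t == tup) > 0)


def t_not(value):
    """Negated literal: 1 - T_P(A)."""
    return 1 - value


def t_and(*values):
    """Conjunction of k literals: product of their values, 1 when k = 0."""
    result = 1
    for value in values:
        result *= value
    return result


def t_exists(domain, body):
    """Existential quantification: sum of the body over the variable's domain."""
    return sum(body(y) for y in domain)


def is_dominating_set(nodes, edges, v):
    """Check node(X), not v(X) => connected(X), with connected(X) <- edge(X,Y), v(Y)."""
    for x in nodes:
        # Standard rule: connected[x] > 0 <=> sum(y in node) edge(x, y) * v[y] > 0
        connected_x = int(t_exists(nodes, lambda y: t_and(t_base(edges, (x, y)), v[y])) > 0)
        # Constraint: node(x) * (1 - v[x]) > 0 => connected[x] > 0
        body = t_and(t_base([(n,) for n in nodes], (x,)), t_not(v[x]))
        if body > 0 and not connected_x > 0:
            return False
    return True
```

The edge relation is read as stored, so an undirected graph has to be supplied with both orientations of every edge.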

The following theorem shows the correctness of our translation. As we partition
a program P into four distinct components P^1, P^2, P^3 and P^4, where the
components P^1, P_S^3 and P^4 are computed by means of a fixpoint algorithm, whereas
the components P^2 and P_C^3 are translated into OPL programs, we next show the
correctness of the translation of queries Q = ⟨P, G⟩, where P = P^2, i.e. we assume
that components P^1, P^3 and P^4 are empty. Thus, such queries consist only of rules
defining guess predicates, constraints and standard rules defining constrained
predicates which are not recursion-dependent. Programs and queries of this form will
be called R-NP Datalog (restricted NP Datalog).

Theorem 2

For every R-NP Datalog query Q, W_Q(Q) ≡ Q.

Proof. For each R-NP Datalog query Q = ⟨P, G⟩, where each standard predicate
is defined by a unique extended rule, the query Q^r = ⟨P^r, G^r⟩ is derived as follows:

1. every generalized partition rule of the form:

    ⊕_L s(X_1, ..., X_k, L) ← body(X_1, ..., X_k, Y_1, ..., Y_n), d(L)

is substituted by the constraints:

    body(X_1, ..., X_k, Y_1, ..., Y_n) ⇒ s(X_1, ..., X_k, L)
    s(X_1, ..., X_k, L_1), s(X_1, ..., X_k, L_2) ⇒ L_1 = L_2
    s(X_1, ..., X_k, L) ⇒ body(X_1, ..., X_k, Y_1, ..., Y_n)

2. each subset rule of the form:

    s(X_1, ..., X_k) ⊆ body(X_1, ..., X_k, Y_1, ..., Y_n)

is replaced by the constraint:

    s(X_1, ..., X_k) ⇒ body(X_1, ..., X_k, Y_1, ..., Y_n)

3. every standard rule of the form:

    A ← Body_1 ∨ ··· ∨ Body_m

is replaced by the constraint³:

    A ⇔ Body_1 ∨ ··· ∨ Body_m

3 A shorthand for the two constraints:
    A ⇒ Body_1 ∨ ··· ∨ Body_m
    A ⇐ Body_1 ∨ ··· ∨ Body_m

4. For each derived predicate p with schema p(dom_1, ..., dom_k) we introduce (i)
a new predicate symbol p' with schema p'(dom_1, ..., dom_k), and (ii) a rule of
the following form:

    p(X_1, ..., X_k) ⊕ p'(X_1, ..., X_k) ← dom_1(X_1), ..., dom_k(X_k)        (8)

These rules are introduced to assign, non-deterministically, a truth value to
derived atoms.

Clearly, the queries Q and Q^r are equivalent as the correct truth value of derived
atoms is determined by the constraints. It is worth pointing out that for each
(partition, subset and standard) rule r a constraint of the form Head(r) ⇒ Body(r)
was introduced to guarantee that models contain only "supported atoms", i.e. atoms
derivable from r.

The program W_Q(Q) is just a translation of Q^r into OPL statements where:

• rules of form (8) do not need to be translated into corresponding OPL
statements as each derived predicate p, defined by such a rule, is translated into a
boolean k-dimensional array.

• the first two constraints, derived from the rewriting of partition rules, for
ensuring (i) the assignment of each element in the body to some class L
and (ii) the uniqueness of this assignment, are rewritten into a unique OPL
constraint. □

Example 12

Min-Coloring. The OPL program corresponding to the (simplified) translation of
the min-coloring query of Example 6 is as follows:

dvar boolean col[node, color];
dvar boolean used_color[color];

minimize
    sum(c in color) used_color[c];

subject to {
    forall (x in node)
        (sum(x in node) 1 > 0) > 0 => sum(c in color) col[x, c] == 1;

    forall (x in node, c in color)
        col[x, c] > 0 => (sum(x in node) 1 > 0) > 0;

    forall (c in color)
        used_color[c] > 0 <=> sum(x in node) col[x, c] > 0;

    forall (x in node, y in node, c in color)
        (sum(<x,y> in edge) 1 > 0) * col[x, c] * col[y, c] > 0 => false;
}; □

Code optimization. The number of (ground) constraints can be strongly reduced 
by applying simple optimizations to the OPL code. 

• Range restriction. If the OPL code contains constructs of the form: 

forall(X_1 in D_1, ..., X_n in D_n)
    (sum(<X_1, ..., X_k> in T) 1 > 0) (Statement_1) => (Statement_2)

the sum construct can be deleted so that the constraint can be rewritten as
follows:

forall(<X_1, ..., X_k> in T, X_{k+1} in D_{k+1}, ..., X_n in D_n)
    1 (Statement_1) => (Statement_2)

If the OPL code contains constructs of the form:

forall(X_1 in D_1, ..., X_n in D_n)
    (sum(X_i in D_i) 1 > 0) (Statement)

the sum construct can be deleted so that the constraint can be rewritten as:

forall(X_1 in D_1, ..., X_n in D_n)
    1 (Statement)



Example 13 

By applying the optimizations above, the min-coloring problem can be rewrit- 
ten as follows: 



dvar boolean col[node, color];
dvar boolean used_color[color];

minimize
    sum(c in color) used_color[c];

subject to {
    forall (x in node)
        1 > 0 => sum(c in color) col[x, c] == 1;

    forall (x in node, c in color)
        col[x, c] > 0 => 1 > 0;

    forall (c in color)
        used_color[c] > 0 <=> sum(x in node) col[x, c] > 0;

    forall (<x,y> in edge, c in color)
        col[x, c] * col[y, c] > 0 => false;
};

• Constraint optimization. A very simple optimization consists in deleting the
OPL constraints whose head is always true (e.g. the head consists of the
constant 1) as they are always satisfied. For instance, in the above example the
second OPL constraint can be deleted as its head consists of the constant 1.
An additional simple optimization can be performed by "pushing down"
conditions defined inside the OPL constraints. For instance, the following code:

Q (X_1 in D_1, ..., X_k in D_k) (statement_1) X_i θ X_j => (statement_2)

where Q is either forall or sum, X_i and X_j are either variables or constants
or arithmetic expressions, and θ is a comparison operator, can be rewritten as

Q (X_1 in D_1, ..., X_k in D_k : X_i θ X_j) (statement_1) => (statement_2)

• Arrays reduction. A further optimization can be performed by reducing the
dimension of the arrays (of decision variables) corresponding to some guess
predicates. Specifically, given a guess predicate s defined by generalized
partition rules of the form:

⊕_L s(X_1, ..., X_k, L) ← body(X_1, ..., X_k, Y_1, ..., Y_n), dom(L)

instead of declaring a (k+1)-dimensional array of boolean decision variables,
it is possible to introduce a k-dimensional array s of integer decision variables
ranging in {0, ..., |dom|} and map each value in dom to {1, ..., |dom|} by means
of a one-to-one function. The meaning of s[X_1, ..., X_k] = c is that if c ≠ 0 then
the atom s(X_1, ..., X_k, c') is true, where c' is the value in dom corresponding
to the integer c; if c = 0 then the atom s(X_1, ..., X_k, c') is false for any value
c' in dom. Clearly, to make the OPL program consistent, every instance of
s[X_1, ..., X_k, C] must be substituted with (s[X_1, ..., X_k] == C) and in each
forall or sum statement containing variables ranging in dom the condition
C ≠ 0 must be verified.

Example 14

The application of the previous optimizations to the program of Example 13
gives the following OPL program:

int cardcolor = card(color);
range intcolor = 0 .. cardcolor;
dvar int col[node] in intcolor;
dvar boolean used_color[intcolor];

minimize
    sum(c in intcolor : c != 0) used_color[c];

subject to {
    forall (x in node)
        1 > 0 => sum(c in intcolor : c != 0) (col[x] == c) > 0;

    forall (c in intcolor : c != 0)
        used_color[c] > 0 <=> sum(x in node) (col[x] == c) > 0;

    forall (<x,y> in edge, c in intcolor : c != 0)
        (col[x] == c) * (col[y] == c) > 0 => false;
}; □

• Variable deletion. A further optimization regards the deletion of unnecessary
variables and the reduction of domains. For instance, in the last constraint in
the OPL program of the previous example, the variable c can be deleted as
it is just used to define the matching between col[x] and col[y]. Thus, this
constraint can be rewritten as:

forall (<x,y> in edge)
    (col[x] == col[y]) > 0 => false;

Observe that, if the body of the partition rule only contains database
domains, the integer decision variables of the guess predicate can range in the
set of integers {1, ..., |dom|} as the head atom is true for all possible values of
its variables X_1, ..., X_k. This means that under such circumstances, it is not
necessary to introduce the additional condition stating that the value of the
variable cannot be 0. Under this rewriting, the first constraint can be deleted
as its head is always satisfied.

The following example shows the final version of the min coloring program, 
obtained by applying the optimizations above. 

Example 15 

Min Coloring (optimized version). 

int cardcolor = card(color);
range intcolor = 1 .. cardcolor;
dvar int col[node] in intcolor;
dvar boolean used_color[intcolor];

minimize
    sum(c in intcolor) used_color[c];

subject to {
    forall (c in intcolor)
        used_color[c] > 0 <=> sum(x in node) (col[x] == c) > 0;

    forall (<x, y> in edge)
        (col[x] == col[y]) > 0 => false;
};

The OPL program corresponding to the k-coloring problem consists of only 
one constraint, namely the second OPL constraint in the previous example. 
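The meaning of the optimized program of Example 15 can be checked on small instances by exhaustive search. The Python sketch below is an illustration under stated assumptions: `min_coloring` is a hypothetical name, colors are mapped to the integers 1 .. |color| as in the arrays reduction, and enumeration stands in for the constraint solver.

```python
from itertools import product


def min_coloring(nodes, edges, colors):
    """Return (number of used colors, coloring) for a best coloring, or None if none exists."""
    intcolor = range(1, len(colors) + 1)        # range intcolor = 1 .. cardcolor
    best = None
    for assignment in product(intcolor, repeat=len(nodes)):
        col = dict(zip(nodes, assignment))      # dvar int col[node] in intcolor
        # Second constraint: (col[x] == col[y]) > 0 => false, for every edge.
        if any(col[x] == col[y] for (x, y) in edges):
            continue
        # First constraint: used_color[c] > 0 <=> sum(x in node) (col[x] == c) > 0.
        used_color = {c: int(sum(int(col[x] == c) for x in nodes) > 0) for c in intcolor}
        cost = sum(used_color.values())         # objective: sum(c in intcolor) used_color[c]
        if best is None or cost < best[0]:
            best = (cost, {x: colors[col[x] - 1] for x in nodes})
    return best


graph_nodes = ["a", "b", "c", "d"]
graph_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
best = min_coloring(graph_nodes, graph_edges, ["red", "green", "blue", "yellow"])
```

On the sample graph used for the database translation, the triangle a, b, c forces three colors. Dropping the objective and keeping only the edge constraint gives the k-coloring search problem.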

Aggregates. The current version of the paper does not include aggregates, although
the language could be easily extended with stratified aggregates which can be
effortlessly translated into OPL programs.

Consider, for instance, a digraph stored by means of the two relations node and
edge and the following logic rule with aggregate⁴:

out(X, C) ← edge(X, Y), count((X), C)

computing for each node X the number C of outgoing arcs. Such a rule could be
easily translated into the following OPL script code:

{string} node = ...;
tuple edge {string a; string b;};
{edge} edges = ...;
int out[node];

execute {
    for (var e in edges)
        out[e.a] = out[e.a] + 1;
}

4 The syntax used refers to the proposal presented in (Greco 1999).

In this paper, we have not considered aggregates since we would like to
define more efficient translations which allow us to express and efficiently compute
greedy and dynamic programming algorithms. In the literature, there have been
several proposals to extend Datalog with aggregates. For instance, the proposal
of (Greco 1999) allows us to write rules with stratified aggregates and evaluate
programs so that the behavior of dynamic programming is captured (see also
(Greco and Zaniolo 2001) for greedy algorithms).

Consider the query ⟨SP, stc(X, Y, C)⟩ computing the shortest paths of a weighted
digraph, where SP consists of the following rules:

stc(X, Y, C) ← tc(X, Y, C), min((X, Y), C).
tc(X, Y, C) ← edge(X, Y, C).
tc(X, Y, C) ← edge(X, Z, C1), tc(Z, Y, C2), C = C1 + C2.

and weights associated with arcs are positive integers. A standard translation and 
execution has two main problems: i) the computation is not efficient since for each 
pair of nodes all paths with different weights are considered, and ii) if the graph 
is cyclic the computation never terminates (or terminates with an error). Since 
shortest paths can be obtained by considering other shortest paths, an OPL Script 
computing them could be as follows: 

// declarations
tuple edge {string a; string b; int c;};
{edge} edges = ...;
{string} node = ...;
int tc[x in node][y in node] = maxint;
int stc[x in node][y in node] = maxint;

execute {
    // exit rule
    for (var ei in edges) {
        tc[ei.a][ei.b] = ei.c;
        stc[ei.a][ei.b] = ei.c;
    }

    // recursive rule
    var modified = true;
    while (modified) {
        modified = false;
        for (var e in edges)
            for (var y in node) {
                tc[e.a][y] = e.c + stc[e.b][y];
                if (tc[e.a][y] < stc[e.a][y]) {
                    modified = true;
                    stc[e.a][y] = tc[e.a][y];
                }
            }
    }
}
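The same relaxation loop can be written in Python for experimentation. This is a sketch, not the script generated by the system: `shortest_paths` is a hypothetical name, `math.inf` plays the role of maxint, the auxiliary table tc is folded into a local variable, and the exit rule takes the minimum over parallel arcs, which is an assumption added here.

```python
import math


def shortest_paths(nodes, edges):
    """Fixpoint computation of shortest-path costs; edges are (source, target, weight) triples."""
    stc = {(x, y): math.inf for x in nodes for y in nodes}

    # Exit rule: a single arc is a path (minimum taken over parallel arcs).
    for (a, b, c) in edges:
        stc[(a, b)] = min(stc[(a, b)], c)

    # Recursive rule: extend known shortest paths by one arc until nothing improves.
    modified = True
    while modified:
        modified = False
        for (a, b, c) in edges:
            for y in nodes:
                candidate = c + stc[(b, y)]
                if candidate < stc[(a, y)]:
                    stc[(a, y)] = candidate
                    modified = True

    return {pair: cost for pair, cost in stc.items() if cost < math.inf}


costs = shortest_paths(
    ["a", "b", "c"],
    [("a", "b", 1), ("b", "c", 2), ("a", "c", 5), ("c", "a", 1)],
)
```

Because only improvements are recorded and the weights are positive integers, the loop terminates on cyclic graphs as well, which is the point made in the text about the standard translation.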



Implementation and experiments 

A system prototype translating NP Datalog queries into OPL programs and
executing the target code using the ILOG OPL Development Studio has been
implemented. The system architecture, depicted in Fig. 1, consists of five main modules
whose functionalities are next briefly discussed.



[Figure: system architecture diagram. Labels recoverable from the extraction: Query (+ Schema), NP Datalog database storage, DB Compiler.]

Fig. 1. System Architecture.



• User Interface - This module receives in input a pair of strings identifying the 
file containing the source database and the file containing the query. If both 
the database and the query have already been translated, then the UI asks 
the module ILOG Solver to execute the query. If the database (resp. query) 
has not been translated, then the UI sends the name of the file containing the 
source database (resp. query) to the module Database Compiler (resp. Query 
Compiler) to be translated. Moreover, this module is in charge of visualizing 
the answer to the input query. 

• Database compiler - This module translates the source database into an OPL 
database.

• Query compiler - This module receives in input an NP Datalog query and
gives in output the corresponding OPL code. In order to check the correctness
of the query and generate the target code, the module uses information on
the schema of predicates.

• Optimizer - This module rewrites the OPL code received from the module 
Query Compiler and gives in output the target (optimized) OPL code. 

• Query executor - This module consists of the ILOG OPL Development Stu- 
dio which executes the query stored by the module Optimizer into the OPL 
program storage, over a database stored into the OPL database storage. The 
module Query executor interacts with the module User Interface by providing 
it the obtained result. 

Therefore, NP Datalog can also be used to define a logic interface for constraint
programming solvers such as ILOG. The experiments presented in this subsection
show that the combination of the two components is effective, so that constraint
solvers (as well as SAT solvers) can be used as an efficient tool for computing logic
queries whose semantics is based on stable models.

In order to assess the efficiency of our approach, we have performed several
experiments comparing the performance obtained by implementing NP Datalog over
the ILOG OPL Development Studio against Answer Set Programming systems.
Specifically, NP Datalog/OPL has been compared with DLV, Smodels, ASSAT,
Clasp and XSB. The following versions of the aforementioned systems have been
used:

• ILOG OPL Development Studio 6.1 (ILOG OPL Studio)

• DLV release 2007-10-11 (DLV Web Site)

• Smodels 2.33 (and lparse 1.1.1) (Smodels Web Site)

• ASSAT 2.02 (lparse 1.1.1 and zChaff 2007.3.12) (ASSAT Web Site; zChaff)

• Clasp 1.2.1 (and lparse 1.1.1) (Clasp Web Site)

• XSB version 3.2 March 15, 2009 (XSB Web Site)

The performances of the systems have been evaluated by measuring the time 
necessary to find one solution of the following problems: 3-Coloring, Hamiltonian 
Cycle, Transitive Closure, Min Coloring, N-Queens and Latin Squares. 

For each system, we have used efficient encodings of the problems which exploit
efficient built-in constructs provided by the systems. Every encoding and
database used in the experiments can be downloaded from the NP Datalog web
site (http://wwwinfo.deis.unical.it/npdatalog/).



All the experiments were carried out on a PC with an Intel Core Duo 1.66 GHz
processor and 1 GB of RAM under the Linux operating system. The experimental
results are presented in the rest of this section.

3-Coloring. The 3-Coloring query has been evaluated on structured graphs of the
form reported in Fig. 2(i) and random graphs. Specifically, structured graphs with
base = height have been used (here base denotes the number of nodes in the same
row, height the number of nodes in the same column; the total number of nodes
in the graph is base * height). The random graphs have been generated by means
of Culberson's graph generator (K-Colorable graph generator). Specifically, the
following parameters have been used: K-coloring scheme equal to Equi-partitioned,
Partition number equal to 3, Graph type equal to IID (independent random edge
assignment). Both structured and random graphs are all 3-colorable; the results,
showing the execution times (in seconds) as the size of the graph increases, are
reported in Fig. 3 and Fig. 4, respectively.





Fig. 2. Structured Graphs. 

As for structured graphs, the x-axis reports the number of nodes in the same layer
(i.e. the value of base). NP Datalog and DLV are faster than the other systems;
ASSAT and Clasp have almost the same execution times (observe that the scale of
the y-axis is logarithmic).

Regarding random graphs, it is worth noting that we have considered, for each
number of nodes, five different graphs. Thus, the execution times reported in Fig. 4
have been obtained by evaluating the query five times (over different graphs with the
same number of nodes) and computing the mean value. NP Datalog/OPL is faster
than the other systems; again, ASSAT and Clasp have almost the same execution
times.

Fig. 3. Execution time for the 3-coloring problem on structured graphs.



Fig. 4. Execution time for the 3-coloring problem on random graphs.

Hamiltonian Cycle. The Hamiltonian Cycle problem has been evaluated over
benchmark graphs used to test other systems (HC Instances) and random graphs
generated by means of Culberson's graph generator (HC Program Archive). All the
graphs have a Hamiltonian cycle. The NP Datalog encoding (as well as the
encodings for the other systems) can be found on the (NP Datalog Web site). The
results are reported in Fig. 5 and Fig. 6. The x-axis reports the graphs used: a label
nvXaY refers to a graph with X nodes and Y arcs. Observe that, in Fig. 5, a missing
value means that the system did not answer within 30 minutes. Clasp is the fastest
system for both types of graphs. DLV and Smodels are on average faster than the
remaining systems. For large "dense" graphs Smodels outperforms DLV, but on
some benchmark instances it runs out of time.







Fig. 5. Execution time for the 
Hamiltonian Cycle problem on 
benchmark graphs. 



Fig. 6. Execution time for the 
Hamiltonian Cycle problem on 
random graphs. 



Transitive Closure. The Transitive Closure problem has been evaluated over
directed structured graphs such as those reported in Fig. 7. Specifically, instances
with base = height have been used (base denotes the number of nodes in the same
row, height the number of nodes in the same column).





Fig. 7. Directed structured graphs.

The results, which are reported in Fig. 8, show that DLV and XSB are faster than
the other systems; ASSAT, Clasp and Smodels have almost the same execution
times.

Fig. 8. Execution time for the Transitive Closure problem on directed structured
graphs.



Min Coloring. As for the Min Coloring optimization problem, we have used
structured graphs such as those of Fig. 2. Instances having the structure reported in
Fig. 2(i) need at least three colors to be colored, whereas instances having the
structure reported in Fig. 2(ii) need at least four colors to be colored. The number
of colors available in the database has been fixed for the two structures,
respectively, to four and five (one more than the number of colors necessary to color
the graph). The results are reported in Fig. 9 and show that NP Datalog outperforms
DLV. A missing time means that DLV runs out of time (also in this case we had a
30 minute time-limit).

Fig. 9. Execution time for the Min Coloring problem on structured graphs.



N-Queens. We have considered empty chessboards to be filled with N Queens for
increasing values of N. The results are reported in Fig. 10. Clasp is faster than
NP Datalog, which is in turn faster than ASSAT; for a high number of queens, DLV
and Smodels become slower than the other systems.

Fig. 10. Execution time for the N-Queens problem.



Latin Squares. We have considered partially filled tables which have been
generated randomly. In every table, 60% of the squares are empty. We have considered,
for each table size, five different instances. Thus, the execution times reported
in Fig. 11 have been obtained by evaluating the query five times (over different
tables of the same size) and computing the mean value. The results show that
NP Datalog and Clasp are faster than the other systems.

Fig. 11. Execution time for the Latin Squares problem on random squares.

The experimental results reported above show that our system only seems to
suffer with programs where the evaluation of the deterministic components is
predominant. The reason is that deterministic components (often consisting of
recursive rules) are translated into OPL scripts, which correspond to the evaluation of
such components by means of the naive fixpoint algorithm, whereas problems which
can be expressed without recursion (or in which the non-deterministic components
are predominant) are executed efficiently. The implementation of our system
prototype could be enhanced by making the translation of stratified
(sub)programs more efficient or by using a different evaluator for these components.
For instance, they could be evaluated by means of ASP systems, thus combining their
efficiency in the computation of deterministic components with the efficiency of OPL
in the computation of non-deterministic components.
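
To make the cost of naive fixpoint evaluation concrete, the following Python
sketch computes the transitive closure of a small graph in two ways. It is an
illustration only and does not reproduce the OPL scripts generated by the
prototype. The semi-naive variant is a standard technique from the deductive
database literature that is included here purely for comparison (it is not
claimed to be part of the system); the function names and the probe counter are
hypothetical.

```python
from typing import Set, Tuple

Edge = Tuple[int, int]


def naive_tc(edges: Set[Edge]) -> Tuple[Set[Edge], int]:
    """Naive fixpoint for
           tc(X,Y) <- edge(X,Y).
           tc(X,Y) <- tc(X,Z), edge(Z,Y).
    Every iteration re-applies the rules to the whole tc relation.
    Returns the closure and the number of join probes performed."""
    tc: Set[Edge] = set()
    probes = 0
    while True:
        new = set(edges)
        for (x, z) in tc:
            for (z2, y) in edges:
                probes += 1
                if z == z2:
                    new.add((x, y))
        if new == tc:          # nothing changed: fixpoint reached
            return tc, probes
        tc = new


def semi_naive_tc(edges: Set[Edge]) -> Tuple[Set[Edge], int]:
    """Semi-naive variant: only tuples derived in the previous iteration
    (delta) are joined with edge."""
    tc = set(edges)
    delta = set(edges)
    probes = 0
    while delta:
        derived = set()
        for (x, z) in delta:
            for (z2, y) in edges:
                probes += 1
                if z == z2:
                    derived.add((x, y))
        delta = derived - tc   # keep only genuinely new tuples
        tc |= delta
    return tc, probes


chain = {(0, 1), (1, 2), (2, 3), (3, 4)}
naive_result, naive_probes = naive_tc(chain)
semi_result, semi_probes = semi_naive_tc(chain)
```

On the chain above both functions return the same ten pairs, but the naive
version performs more join probes because it re-derives known tuples at every
iteration.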

6 Related Languages and Systems 

Several languages have been proposed for solving NP problems. Here we have
analyzed three different classes of languages: specification languages, constraint and
logic programming languages, and answer set logic languages.

Specification Languages 

Specification languages are highly declarative and allow the user to specify problems 
in terms of guess and check techniques. 

NP-SPEC (Cadoli et al. 2000; Cadoli and Schaerf 2005) is a logic-based
specification language allowing the built-in second-order predicates Subset, Partition,
Permutation and IntFunc. The semantics of an NP-SPEC program is based on the
notion of model minimality, and the language upon which this semantics relies
is DATALOG^CIRC, i.e. an extension of DATALOG in which only some predicates
are minimized and the interpretation of the others is left open. An NP-SPEC
program consists of two sections: the DATABASE section, specifying the instance,
and the SPECIFICATION section, specifying the question. To make NP-SPEC
executable, specifications are translated into SAT instances and then executed
using a SAT solver.

KIDS (Kestrel Interactive Development System) (Smith 1990) is a semi-automatic
program development system that, starting from an initial specification of the
problem, produces executable code through a set of consistency-preserving
transformations. The problem is written in a logic-based language augmented
with set-theoretic data types and functional constraints on the input/output
behavior. To make the language executable, specifications are first translated
into CommonLisp and then into machine code. Before the compilation task,
the user may select an optimization technique, such as simplification or partial
evaluation, to obtain more efficient target code.

SPILL-2 (Specifications In a Logic Language) is the second version of an
executable typed logic language that is an extension of the Prolog-like language
Goedel (Kluzniak and Milkowska 1997). A specification in SPILL-2 consists of a
set of type declarations, a set of function declarations, a set of predicate
declarations and a number of logical expressions (queries) that are used to test the
specification. A specification in SPILL is required to be "executable" in the sense
that it is possible to "test" whether a provided solution is feasible w.r.t. a given
specification. The execution of a program consists of evaluating each query in the
context of the specification and reporting the result (true if the query succeeds
and false otherwise).

Constraint and Logic Programming Languages

The basic idea of constraint programming (CP) is to model and solve a problem 
by exploring the set of constraints that fully characterize the problem. Almost all 
computationally hard problems, such as planning, scheduling and graph theoretic 
problems, fall into this category. A large number of systems (more than 40) for 
solving CP problems have been developed in computer science and artificial intel- 
ligence: 

• Constraint Logic Programming, an extension of logic programming able
to manage constraints, started about 20 years ago by Jaffar et al.
(Jaffar et al. 1992). Several constraint logic languages allowing the
formulation of constraints over different domains exist. Basically, all these languages
embed efficient constraint solvers in logic-based programming languages, such
as Prolog. Here we cite, among others, CLP (Marriott and Stuckey 1998),
SICStus Prolog (SICStus Prolog Web Site), BProlog (Zhou 2002), ECLiPSe
(Wallace and Schimpf 1999) and Mozart (Van Roy et al. 1999).

• ILOG OPL Development Studio (ILOG OPL Studio), an integrated
development environment for mathematical programming and combinatorial
optimization applications. The syntax of OPL is well-suited to express
optimization problems defined in the mathematical programming style
(Van Hentenryck 1988; Van Hentenryck et al. 1999).

• Constraint LINGO (Finkel et al. 2004), a high-level logic-programming
language for expressing tabular constraint-satisfaction problems such as those
found in logic puzzles and combinatorial problems such as graph coloring.

Several languages extending Prolog have been proposed as well. Most of these
languages have been designed to provide powerful capabilities to represent and solve
general problems and not to solve NP problems. Here we mention:

BinProlog (BinProlog), a fast and compact Prolog compiler, based on the transfor- 
mation of Prolog to binary clauses. BinProlog is based on the BinWAM abstract 
machine, a specialization of the WAM for the efficient execution of binary logic 
programs. 

XSB (Rao et al. 1997), an extension of Prolog supporting the well-founded
semantics (Van Gelder et al. 2001) and including implementations of OLDT (tabling)
and HiLog terms. OLDT resolution is extremely useful for recursive query
computation, allowing programs to terminate correctly in many cases where Prolog
does not. HiLog supports a type of higher-order programming in which predicate
symbols can be variable or structured.

An extension of classical first order logic, called ID-Logic, has been proposed in
(Denecker 2000). Basically, in an ID-Logic theory, we can distinguish four different
components describing i) data, ii) open predicates, iii) definitions and iv) assertions
(or constraints). The relationship between ID-Logic and ASP has been studied in
(Mariën et al. 2004), where it has also been shown how ID-Logic theories can
be translated into DATALOG¬ programs under ASP semantics.

Answer Set Programming Languages and Systems

Several deductive systems based on stable model semantics have been developed 
too. Here we discuss some of the more interesting answer-set based systems and 
languages: 

DLV (Vienna Univ. of Technology and University of Calabria) (Leone et al. 2006;
Eiter et al. 1997) is a deductive database system based on disjunctive logic
programming. DLV extends Datalog with general negation, inclusive head
disjunction and two different forms of constraints: strong constraints, which must be
satisfied, and weak constraints, which are satisfied if possible (preferred models
are those which minimize the number of ground weak constraints which are not
satisfied). For instance, the program of Example 1 is a DLV program, whereas
by replacing exclusive disjunction with inclusive disjunction in the program of
Example 2 we get a DLV program (the minimality of the models guarantees that
no node can belong to both relations v and nv). The optimization query
of Examples 1 and 2 can be defined by adding the weak constraint :~ v(X), which
minimizes the number of ground false weak constraints (i.e. v-tuples).
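
The guess/check/optimize reading of this encoding can be illustrated with a
brute-force Python sketch for Vertex Cover. It is not how DLV evaluates weak
constraints; it only shows the intended meaning: guess a set v of nodes, check
that every edge is covered, and prefer guesses with the fewest v-tuples. The
function names are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple

Edge = Tuple[int, int]


def is_vertex_cover(cover: Set[int], edges: Iterable[Edge]) -> bool:
    """Check step: every edge must have at least one end in the cover."""
    return all(u in cover or w in cover for (u, w) in edges)


def min_vertex_cover(nodes: Iterable[int], edges: Iterable[Edge]) -> Set[int]:
    """Guess step by exhaustive enumeration, smallest candidates first, so the
    first cover found violates the fewest weak constraints (one per v-tuple)."""
    ordered = sorted(nodes)
    edge_list = list(edges)
    for size in range(len(ordered) + 1):
        for candidate in combinations(ordered, size):
            if is_vertex_cover(set(candidate), edge_list):
                return set(candidate)
    return set(ordered)  # not reached: the set of all nodes is always a cover


# A path 1 - 2 - 3 - 4 needs two nodes to cover its three edges.
path_edges = [(1, 2), (2, 3), (3, 4)]
cover = min_vertex_cover({1, 2, 3, 4}, path_edges)
```

The enumeration is exponential in the number of nodes and is meant only to
clarify the semantics, not to suggest an evaluation strategy.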

Smodels (Helsinki Univ. of Technology) (Simons et al. 2002) is a system for
answer set programming consisting of Smodels, an efficient implementation of the
stable model semantics for normal logic programs, and lparse, a front-end that
transforms user programs so that they can be understood by Smodels.
Besides standard rules, lparse also supports a number of extended rules: choice,
constraint and weight rules. The formal semantics of all three types of rules can
be defined through the use of weight constraints and weight constraint rules. In
lparse the weight constraints are implemented as special literal types. Basically, a
weight constraint is of the form $L \le \{l_1 = w_1, \ldots, l_n = w_n\} \le U$, where
$l_1, \ldots, l_n$ are literals, $L$ and $U$ are the integral lower and upper bounds, and
$w_1, \ldots, w_n$ are the weights of the literals. The intuitive semantics of a weight
constraint is that it is satisfied exactly when the sum of the weights of the satisfied
literals among $l_1, \ldots, l_n$ is between $L$ and $U$, inclusive. A weight constraint
rule is of the form $C_0 \leftarrow C_1, \ldots, C_n$, where $C_0, \ldots, C_n$ are weight
constraints. Besides the use of literals, lparse also supports conditional literals
having the form p(X) : q(X), where p(X) is any basic literal and q(X) is a domain
predicate.
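
The satisfaction condition of a weight constraint can be written down directly.
The Python sketch below is an illustration under simplifying assumptions: ground
literals are represented as strings, a literal of the form "not a" is taken to be
satisfied when the atom a is not in the candidate model, and the function name is
hypothetical (it is not part of lparse or Smodels).

```python
from typing import Dict, Set


def weight_constraint_satisfied(lower: int, weights: Dict[str, int],
                                upper: int, model: Set[str]) -> bool:
    """Return True iff lower <= (sum of weights of satisfied literals) <= upper.

    weights maps each ground literal to its weight; model is the set of atoms
    that are true in the candidate model."""
    def holds(literal: str) -> bool:
        if literal.startswith("not "):
            return literal[4:] not in model
        return literal in model

    total = sum(w for literal, w in weights.items() if holds(literal))
    return lower <= total <= upper


# 3 <= {a = 2, b = 3, not c = 1} <= 4
example = {"a": 2, "b": 3, "not c": 1}
```

With the model {a}, the satisfied literals are a and not c, their weights sum to
3 and the constraint holds; with the model {a, b, c} the sum is 5 and it does
not. A cardinality constraint corresponds to the special case in which all
weights are equal to 1.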

Datalog Constraint. East and Truszczynski (2000) proposed a new nonmonotonic
logic, called Datalog with constraints or DC. A DC theory consists of constraints
and Horn rules (a Datalog program). The language is determined by a set of atoms
$At = At_C \cup At_H$, where $At_C$ and $At_H$ are disjoint. Formally, a DC theory is a
triple $T = (T_C, T_H, T_{PC})$, where $T_C$ is a set of constraints over $At_C$, $T_H$ is a set
of Horn rules whose head atoms belong to $At_H$ and $T_{PC}$ is a set of constraints
over $At$ (post constraints). The problem of the existence of an answer set, for a
finite propositional DC theory $T$, is NP-complete (East and Truszczynski 2000).

ASSAT. Lin and Zhao (2004) proposed a translation from normal logic programs
with constraints under the answer set semantics to propositional logic. The
peculiarity of this technique is that, for each loop in the program,
a corresponding loop formula is added to the program's completion. The result
is a one-to-one correspondence between the answer sets of the program and the
models of the resulting propositional theory. As in the worst case the number
of loops in a logic program can be exponential, the technique proposes to add a
few loop formulas at a time, selectively. Based on these results, a system called
ASSAT(X), depending on the SAT solver X used, has been implemented for
computing answer sets of a normal logic program with constraints.

Cmodels (Lierler 2005a; Lierler 2005b) is an answer set programming system that
uses the frontend lparse and whose main computational characteristic is that it
computes answer sets using a SAT solver for search. Cmodels deals with programs
that may contain disjunctive, choice, cardinality and weight constraint rules. The
basic execution steps of the system can be outlined as follows: (1) the program's
completion is produced; (2) a model of the completion is computed using a SAT
solver; (3) if the model is indeed an answer set, then the model is returned,
otherwise the system goes back to Step 2. The idea is thus to use a SAT solver
for generating model candidates and then check if they are indeed the answer
sets of a program. The way Step 3 is implemented depends on the class of a logic
program.
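
The three steps above can be mimicked on tiny ground normal programs with the
Python sketch below. It is an illustration under stated assumptions, not the
Cmodels implementation: the SAT solver of Step 2 is replaced by brute-force
enumeration of the models of Clark's completion, Step 3 is implemented with the
textbook Gelfond-Lifschitz reduct test, and all names are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Iterator, List, Set, Tuple

# A ground normal rule: (head, positive body atoms, negated body atoms).
Rule = Tuple[str, FrozenSet[str], FrozenSet[str]]


def body_holds(rule: Rule, interp: Set[str]) -> bool:
    _, pos, neg = rule
    return pos <= interp and not (neg & interp)


def completion_models(atoms: Iterable[str], rules: List[Rule]) -> Iterator[Set[str]]:
    """Steps 1-2 (stand-in for the SAT solver): yield every interpretation in
    which each atom is true iff the body of some rule defining it holds."""
    ordered = sorted(atoms)
    for size in range(len(ordered) + 1):
        for chosen in combinations(ordered, size):
            interp = set(chosen)
            if all((a in interp) ==
                   any(r[0] == a and body_holds(r, interp) for r in rules)
                   for a in ordered):
                yield interp


def least_model_of_reduct(rules: List[Rule], interp: Set[str]) -> Set[str]:
    """Drop rules blocked by interp, delete negative literals from the others,
    and compute the least model of the resulting positive program."""
    reduct = [(head, pos) for (head, pos, neg) in rules if not (neg & interp)]
    model: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, pos in reduct:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model


def answer_sets(atoms: Iterable[str], rules: List[Rule]) -> List[Set[str]]:
    """Step 3: keep only the candidates that are stable models."""
    return [m for m in completion_models(atoms, rules)
            if least_model_of_reduct(rules, m) == m]


# p <- p.    q <- not p.
program: List[Rule] = [("p", frozenset({"p"}), frozenset()),
                       ("q", frozenset(), frozenset({"p"}))]
candidates = list(completion_models({"p", "q"}, program))
stable = answer_sets({"p", "q"}, program)
```

For this program the completion has two models, {p} and {q}, but only {q} passes
the check of Step 3, because p supports itself only through the loop p <- p. This
is the kind of candidate that forces the system to go back to Step 2.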

Clasp (Gebser et al. 2007) is an answer set solver for (extended) normal logic
programs. It combines the high-level modeling capacities of answer set programming
(ASP) with state-of-the-art techniques from the area of Boolean constraint solv- 
ing. In fact, the primary Clasp algorithm relies on conflict-driven learning, a 
technique that proved successful for satisfiability checking (SAT). Unlike other 
ASP solvers that use conflict-driven learning, Clasp does not rely on legacy soft- 
ware, such as a SAT solver or any other existing ASP solver. 

A-Prolog (Gelfond 2002) is a logic language whose semantics is based on stable
models, designed to represent defaults (i.e. statements of the form "Elements
of a class C normally satisfy property P"), exceptions and causal effects of
actions ("statement F becomes true as a result of performing an action A"). In
the same work, an extension of the language, called ASET-Prolog, is presented.
Such an extension enriches the language with two new types of atoms: s-atoms,
which allow us to define subsets of relations, and f-atoms, which allow us to
express constraints on the cardinality of sets. An interesting application showing
how declarative programming in A-Prolog can be used to describe the dynamic
behavior of digital circuits is presented in (Balduccini et al. 2000).

Comparison with the other approaches proposed in the literature 

NP Datalog is related i) to specification languages, for the style of defining
problems, ii) to answer set languages, for the syntax and declarative semantics, and iii)
to constraint programming.

Specification Languages 

The problem with specification languages is the tradeoff between the expressiveness
of the formal notation and its execution. In general, specifications can be executed
only by blind search through the space of all proofs. A possible solution consists
in adding (to specifications) refinements which improve the execution, but the
result could be a longer specification, containing details and, consequently, hard to
understand.

NP-SPEC programs have a structure similar to that of NP Datalog programs,
although, from the syntactic point of view, the use of meta-predicates does not,
in some cases, make programs shorter or more intuitive (see, for instance, the
N-Queens problem reported in (Cadoli and Schaerf 2005)). In addition to standard
Datalog rules, NP-SPEC also uses meta-predicates and set operators, whereas
NP Datalog uses only standard Datalog rules with shortcuts for limited forms
of (unstratified) negation. Moreover, although there is no difference in
expressivity, guesses in NP-SPEC are defined over base relations, whereas in NP Datalog
they are defined over general 'deterministic' relations defined by stratified
Datalog programs. As a further difference, the partition mechanism is more general
and flexible in NP Datalog than in NP-SPEC, as in the latter the number of
partitions is fixed. Concerning the semantics, the declarative semantics of
NP-SPEC programs is based on the notion of model minimality, whereas that of
NP Datalog is based on stable models.

KIDS results are "sensitive" to the implementation issue. Indeed, the KIDS
system is semiautomatic: the user is asked to interact with the system in order to
transform a high-level declarative specification into an efficient, correct and
executable program. Moreover, the complexity of the final implementation in KIDS
can improve dramatically if specialized techniques are used. On the
other hand, NP Datalog is a fully declarative language whose execution process
is automatically optimized by the ILOG OPL Development Studio.

SPILL-2 is not meant to use the specification of a problem in order to
compute a solution, but to test the specification against some specific case, i.e. to
verify whether a given specification implies certain intended properties or, in
other words, whether a specified property is consistent with the specification. As for
differences, specifications and queries in SPILL are compiled to Prolog, whereas
our approach introduces specifications using NP Datalog and then performs the
translation of queries into OPL programs. Moreover, NP Datalog is based on
stable model semantics, whereas SPILL uses a pure first order semantics, i.e. it
does not include any form of model minimization. As a further
difference, it is worth noting that SPILL does not provide a characterization of
its expressive power and its complexity.

Constraint and Logic Programming Languages 

Constraint Logic Languages, such as SICStus Prolog, ECLiPSe and BProlog, are
extensions of Prolog and, therefore, they are not fully declarative. Their semantics
is based on top-down evaluation of queries (SLDNF resolution), whereas answer-set
programming is based on bottom-up evaluation. XSB is an extension of Prolog with
a declarative semantics (namely the well-founded semantics) based on top-down
evaluation of queries (OLDT resolution) with tabling. Moreover, while answer-set
languages permit NP problems to be easily expressed (it suffices to translate their
logic definition into logic programming rules), constraint logic programming
languages are procedural and the efficient implementation of NP problems is hard and
time-consuming.

The relationship between ID-Logic and NP Datalog is strong since in our language
we can also distinguish components describing data, guess predicates, standard
rules and constraints, which correspond, respectively, to the ID-Logic components
describing data, open predicates, definitions and constraints. Moreover, the aim of
NP Datalog is also the easy translation into different formalisms (other than ASP),
including constraint programming languages.

Answer Set Programming Languages 

The main difference of NP Datalog with respect to DLV and Smodels is that only
restricted forms of (unstratified) negation, embedded into built-in constructs, are
allowed. As a consequence, NP Datalog is less expressive than DLV since the latter
also uses (inclusive) disjunction and permits the expression of problems in the second
level of the polynomial hierarchy. The use of simpler languages such as NP Datalog
allows us to avoid writing non-intuitive queries which are difficult to optimize or
to translate into other formalisms for which efficient executors exist. It is important
to observe that cardinality constraints and conditional literals of Smodels allow us
to express both subset and (generalized) partition rules as defined in NP Datalog.
This means that in Smodels it is also possible to avoid using unstratified negation
without losing expressiveness. We also note that s-atoms and f-atoms of
ASET-Prolog enable us to express subset and (generalized) partition rules.

NP Datalog is also strongly connected to Datalog Constraint (DC), which is
also based on stable model semantics. The main difference between NP Datalog
and DC consists in the fact that NP Datalog forces users to write queries in a
more disciplined form. In particular, DC guesses are expressed by means of
constraints (a guess is any set of atoms in $At_C$ satisfying the constraints in $T_C$) and
there is no clear separation between $T_C$ (constraints used to guess) and $T_{PC}$
(constraints used to check). Moreover, DC only uses positive rules to infer true atoms,
whereas NP Datalog uses stratified rules. The expressive power of both languages
captures the first level of the polynomial hierarchy. The experiments reported in
(East and Truszczynski 2000) show that the guess and check style of expressing
hard problems can be further optimized.

The philosophy of ASSAT is similar to that of NP Datalog: while in the ASSAT
approach programs are translated into propositional logic and then executed by means
of a SAT solver, NP Datalog programs are translated into OPL programs and then
executed by using the ILOG OPL Development Studio. An approach similar to that
of ASSAT is adopted by Cmodels.

A-Prolog is a general logic language (i.e. it allows function symbols, classical
negation, head disjunction and subset rules) whose semantics is based on stable
models. The aim of A-Prolog is the design of a general language for knowledge
representation and causal reasoning, whereas NP Datalog is a simpler language
(similar to the restricted FA-Prolog) which can be easily and efficiently executed and
translated into other formalisms.

7 Conclusion 

NP search and optimization problems can be formulated as DATALOG¬ queries under
non-deterministic stable model semantics. In order to enable a simpler and more
intuitive formulation of these problems, the NP Datalog language has been proposed.
It is obtained by extending stratified Datalog with constraints and two constructs
for expressing partitions of relations, so that search and optimization queries can be
expressed using only simple forms of unstratified negation. It has also been shown
that NP Datalog captures the class of NP search and optimization problems and
that NP Datalog queries can be easily translated into OPL programs. An
algorithm for the translation of NP Datalog programs into OPL statements has been
provided and its correctness has been proved. The proposed algorithm has been
implemented by a system prototype which takes as input an NP Datalog query and
outputs an equivalent OPL program, which is then executed using the ILOG
OPL Development Studio. Consequently, NP Datalog can also be used to define
a logic interface for constraint programming solvers. Several experiments
comparing the computation of queries by different systems have shown the validity of our
approach.